STAR Computing requirements proposal FY02/FY03


During the Year2 RHIC run, the STAR experiment accumulated over 10 M AuAu events and 25 M pp events. Although this data sample is a factor of 5 larger than in Year1, it was still much smaller than we had predicted, due to the reduced RHIC physics running time, the rather poor RHIC duty factor, and the limit imposed on our data rate by writing data to RCF. As a result, several measurements in STAR's physics program, such as open charm and event characterization (elliptic flow and HBT) using rare probes, are limited by the statistics of the Year2 data samples. To minimize our dependence on RHIC's performance in the upcoming run, STAR's DAQ-100 project was initiated with the goals of a 60 MB/sec sustained data rate, a 5-fold increase in our event rate, and, ideally, a 40% reduction in our reconstruction time. The DAQ-100 project is now fully operational and is expected to be used during the Year3 run.
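

As a rough illustration of how the DAQ-100 rate translates into raw data volume, the short Python sketch below folds the 60 MB/sec sustained rate into a running period; the number of physics weeks and the duty factor used here are placeholders, not the inputs of the detailed calculations that follow.

    # Back-of-envelope raw data volume under DAQ-100 (illustrative only).
    # The 60 MB/sec sustained rate is the DAQ-100 design goal quoted above;
    # the running weeks and duty factor are hypothetical placeholders.

    DAQ_RATE_MB_S = 60.0   # DAQ-100 sustained rate to RCF (MB/sec)
    PHYSICS_WEEKS = 20     # hypothetical weeks of RHIC physics running
    DUTY_FACTOR   = 0.5    # hypothetical fraction of time actually taking data

    seconds = PHYSICS_WEEKS * 7 * 24 * 3600 * DUTY_FACTOR
    raw_volume_tb = DAQ_RATE_MB_S * seconds / 1.0e6   # MB -> TB

    print(f"Projected raw data volume: {raw_volume_tb:.0f} TB")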


The calculations presented here include the effects of the DAQ-100 project as well as the effects of new detectors coming on line during the next year. We stress that the increased data volumes projected here from the DAQ-100 project are an attempt to recover the full physics program not yet realized due to RHIC's past performance and to protect our program from future limitations of RHIC operations. We have made three scenario estimates, taking into account the merging of FY02 and FY03 funds; only one is proposed as an achievable goal.


The DAQ-100 scenario extends the data deployment model of the current Year2 datasets through the analysis of the FY03 data. This scenario keeps 100% of our micro-DSTs and 25% of our DSTs on centralized disks and represents a known (safe) model for delivering data to physics analysis codes. The result shows a 280 TB disk space deficit. This deficit could be reduced to about 160 TB by removing the disk residency of the DST data; however, even with this reduction, we are 2 million dollars short of our budget requirements and a factor of 10 short of our storage needs.
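

The deficit quoted above is simple bookkeeping of dataset sizes, disk-resident fractions, and the centralized disk actually available. The sketch below shows the structure of that bookkeeping; the dataset sizes and the available disk are placeholders, not the inputs behind the 280 TB figure.

    # Structure of the disk-deficit estimate (placeholder inputs, not the
    # proposal's actual dataset sizes or RCF disk allocation).

    datasets_tb = {            # total size of each data tier, hypothetical
        "DST":      400.0,
        "microDST": 100.0,
    }
    resident_fraction = {      # fractions kept on centralized disk (DAQ-100 scenario)
        "DST":      0.25,
        "microDST": 1.00,
    }
    available_disk_tb = 50.0   # hypothetical central disk actually funded

    required_tb = sum(datasets_tb[t] * resident_fraction[t] for t in datasets_tb)
    deficit_tb  = required_tb - available_disk_tb

    print(f"Disk required: {required_tb:.0f} TB, deficit: {deficit_tb:.0f} TB")
    # Dropping the DST residency (fraction -> 0) removes its contribution,
    # which is what reduces the deficit in the text from ~280 TB to ~160 TB.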


The second scenario (Constrained 1) reduces the number of reconstruction passes, increases the raw/DST compression level, and assumes that 50% (5%) of our micro-DSTs (DSTs) reside on central storage. It does not allow us to recover from the shortfalls unless the processing is spread over 3 years (at the expense of our physics goals), running mainly on our existing resources. This is clearly not viable, since our user farm is already congested and in need of more CPU power, quite apart from the consequences of such a long turnaround for data analyses.


Finally, the Constrained 2, or "challenge", scenario relies on a riskier distributed disk model to cover our disk space requirements without leaving us short of processing power. The approach balances our CPU and disk space needs within a 37-week production period (an assumed 80% RCF farm uptime is also folded in). This approach relies entirely on STAR infrastructure components which do not yet exist and on a robust, heavily accessed HPSS. Nevertheless, it appears to be our only alternative given the current budget restrictions.
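

The CPU side of this balance is straightforward arithmetic: the reconstruction load must fit within the 37-week production period at the assumed 80% RCF farm uptime. The sketch below spells out that check; the event count, per-event reconstruction time, and CPU count are placeholders rather than the numbers of our detailed estimates.

    # CPU / production-period balance for the Constrained 2 scenario
    # (placeholder workload numbers; the 37 weeks and 80% uptime are the
    # assumptions stated in the text).

    N_EVENTS         = 100e6   # hypothetical events to reconstruct
    SEC_PER_EVENT    = 60.0    # hypothetical CPU-sec per event after DAQ-100 gains
    N_CPUS           = 400     # hypothetical reconstruction CPUs at RCF
    PRODUCTION_WEEKS = 37
    FARM_UPTIME      = 0.80

    needed_cpu_sec    = N_EVENTS * SEC_PER_EVENT
    available_cpu_sec = N_CPUS * PRODUCTION_WEEKS * 7 * 24 * 3600 * FARM_UPTIME

    print(f"Needed:    {needed_cpu_sec:.3g} CPU-sec")
    print(f"Available: {available_cpu_sec:.3g} CPU-sec")
    print("Fits in one production cycle" if needed_cpu_sec <= available_cpu_sec
          else "Does not fit: add CPUs or extend the production period")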


A factor of 4 reduction in the RHIC efficiency from the assumptions in these calculations (as occurred last year) would not translate into a factor of 4 reduction in our numbers. DAQ-100 allows us to buffer more events for later transfer during RHIC downtimes; the net effect is at worst a factor of 2 reduction in the data volumes and CPU resource requirements presented here. We therefore feel that, to reach our physics objectives in a reasonable time frame, additional funding is needed. We acknowledge, however, that these numbers depend on factors beyond our control, such as RHIC and RCF performance. Finally, we would like to stress once again that HPSS reliability will become an even more critical requirement than ever before, since the success of our distributed disk model depends on it.
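

To make the buffering argument concrete, the toy model below compares the data volume reaching RCF with and without local DAQ-100 buffering when the beam availability drops by a factor of 4; all rates and availabilities are hypothetical and are chosen only so that the buffered case reproduces the at-worst factor of 2 quoted above.

    # Toy model of the DAQ-100 buffering argument (all rates hypothetical,
    # chosen only to illustrate the "factor 4 in efficiency -> at worst
    # factor 2 in data volume" statement above).

    T_SEC    = 20 * 7 * 24 * 3600   # hypothetical running period, seconds
    DAQ_RATE = 60.0                 # MB/sec, DAQ-100 sustained recording rate
    RCF_RATE = 20.0                 # MB/sec, hypothetical transfer limit to RCF

    def volume_tb(beam_fraction, buffered):
        """Data volume reaching RCF over the running period, in TB."""
        if buffered:
            # Events taken at the full DAQ rate while beam is on are drained
            # to RCF during downtime, so only the time-averaged rate matters.
            rate = min(beam_fraction * DAQ_RATE, RCF_RATE)
        else:
            # Without buffering, transfer happens only while taking data.
            rate = beam_fraction * min(DAQ_RATE, RCF_RATE)
        return rate * T_SEC / 1.0e6

    for frac in (2.0 / 3.0, 1.0 / 6.0):   # nominal vs. 4x worse beam availability
        print(f"beam fraction {frac:.2f}: "
              f"unbuffered {volume_tb(frac, False):.0f} TB, "
              f"buffered {volume_tb(frac, True):.0f} TB")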