PWGC Summary of May, 1999 STAR Computing Meeting

This meeting included a two-day summary of the present state of STAR software infrastructure and plans for future development on Monday and Tuesday. In related sessions on Wednesday and Thursday several computing tutorials were conducted and PWGs held individual and joint meetings.

On Tuesday evening a representative group from the PWGs met informally to consider the state of STAR software as presented in the Monday-Tuesday summary and to prioritize future work from a PWG perspective.

On Friday a wrap-up session was held in which the week's activities were reviewed, the PWG informal meeting results were presented in the plenary session, and a joint prioritization between BNL core computing and the PWGs was conducted. These results are summarized below.
Consult the following page on the computing web pages for additional material.

Thomas Ullrich and Tom Trainor for the PWG conveners

  1. System stability - the working groups appreciate the great effort by BNL core personnel and other collaboration members that has gone into stabilizing the STAR software infrastructure. This achievement will present to nonexperts and new users a software environment which gives good return on their learning investment.

  2. Documentation - recognized as a leading priority. Two types are needed: entry-level (getting started) documentation for beginners, and steady-state documentation (user guide, reference manual) to provide a continuing description of a complex system that will certainly continue to evolve over the next year or two. Robust and simplified entry-level documentation is an immediate action item.
    This is accepted as a very high priority item by STAR core computing. Work is underway, with cooperation between core computing and PWGs, especially designated documentation editors from the PWGs (Peter Jacobs, Lanny Ray and Gene Van Buren). The current bug report system should be used to report incorrect or out-of-date documentation.

  3. Grand Challenge - Access to large data volumes on HPSS will probably not be available via GC infra throughout this calendar year. Given the modest data volumes to be produced in the summer run this may be tolerable. The alternative is data volumes on local STAR disks with an ad-hoc file management system.
    This is accepted as a high-priority item by STAR core computing. Extension of the MySQL data catalogue to a run/event database must be completed as a top priority. Deployment of GC infra by end of year is also a high priority.

  4. Data Bases - this is a rapidly developing area, but much needs to be done very quickly. PWGs need to be substantially involved in DB definition, filling and QA. This is an immediate need in the next weeks. PWGs should meet with DB developers at the earliest opportunity to learn what is planned and what is needed from the physics community.
    The first meeting was held June 3 to identify existing DB info sources on the web. A second meeting with DB experts occurred June 8.

  5. QA - like DBs, this is a rapidly developing area. Peter Ja. would like another couple of weeks to formulate a general strategy. Immediate QA activities in whatever form possible are essential, with gradual replacement by a formal system over the next several months.
    PWGs should begin an immediate, intensive reexamination of the present QA/monitor histogram system managed by Kathy, interpret the results and consider updates. A first pass at this is tentatively scheduled for the June 10 meeting.

  6. Event visualization - There are several approaches presently available and at various stages of development. There are obvious voids in STAR visualization (e.g., on-line data monitoring by event display, more comprehensive raw-data viewer). We do need to plan a comprehensive and integrated visualization system to be available for the November run.
    It is accepted that a critical-path need is on-line monitoring covered by ROOT-based visualization within the general ROOT framework. A priority need is to define and configure the required online displays. DSV handling of STAR data in ROOT-file format is in development.

  7. Large-volume production on CRS - Production operations help will be needed by the Fall run. It is neither possible nor desirable for this to be entirely the responsibility of BNL core people. PWG members will have to learn the details of CRS operation over the coming months.

  8. Raw data files - The current data acquisition scheme, in which all physics triggers are collected in one raw-data file for transmission to RCF, is regarded as not suitable for efficient STAR event reconstruction. This is especially true for peripheral-collision events, for which data volumes are minimal. One solution would be to split the event types into separate files in the STAR `buffer box,' i.e., before they are transmitted to RCF. Another approach would be to reject the decision to permit only one file type to be opened at a time to receive STAR raw data by RCF, and to revisit this important issue with RCF management. This is an immediate action item.

  9. Disk organization - The current organization of the disk space available to STAR is regarded as not sufficient. During the meeting we agreed to organize the disk space as outlined below.

    disk        size     type of data
    disk0       50 GB    MDC2 data (~20GB)
    disk1       60 GB    T3E production data sinking (Pavel)
    disk00000   100 GB   user scratch and Lidia's data from daily/weekly tests
    disk00001   100 GB   actual data production (DSTs, histos, ...)

    A mechanism has to be installed so that old and unused files are automatically removed (either migrated to tape or deleted).
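
As an illustration only, the cleanup mechanism mentioned under item 9 could start from something like the sketch below. It is not an agreed STAR tool: the function names, the idle-time criterion (a file counts as unused when neither its access time nor its modification time is newer than a cutoff) and the dry-run default are all assumptions made for this example. Migration to tape is not covered; the sketch only lists or deletes.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_idle_days, now=None):
    """Return files under `root` not read or modified in `max_idle_days` days.

    Hypothetical helper: the idle criterion is an assumption, not STAR policy.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # A file is stale only if both timestamps are older than the cutoff.
            if max(st.st_atime, st.st_mtime) < cutoff:
                stale.append(path)
    return sorted(stale)


def purge(paths, dry_run=True):
    """Delete the given files; with dry_run=True nothing is removed.

    Returns the list of paths that were (or would have been) deleted.
    """
    handled = []
    for path in paths:
        if not dry_run:
            path.unlink()
        handled.append(path)
    return handled
```

A cautious way to use this would be to run `find_stale_files` on the scratch area first, circulate the resulting list, and only then call `purge(..., dry_run=False)`. Note that access times are only meaningful if the file system records them.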

Near-future plans (from the web page):