For the Solenoid Tracker At RHIC (STAR) collaboration & The Information Technology Division (ITD)
Grid Collaboratory Pilot activities & iVDGL expression of interest
Expression of interest in iVDGL participation
The BNL[1] STAR[2] experiment and ITD[3] have teamed up and completed several important tasks on the path toward consolidating their Grid infrastructure and fully deploying a Grid environment for the BNL community by 2005. However, much remains to be done. To complete such an environment, including its operational aspects, it is appropriate for us to explore new approaches to sharing and competitiveness, and to seek new partnerships both within and outside the field. Furthermore, long term planning and a comprehensive Grid strategy are essential components of our success; within this scope, this document is an opening to partnership with iVDGL.
STAR & ITD Grid activities
The STAR experiment is one of the four Relativistic Heavy Ion Collider (RHIC) experiments at BNL. Unlike traditional nuclear physics experiments, STAR records a massive amount of data each year: up to a PByte (10^15 bytes) per year, for an integrated total to date of 3 PB spread over 3 million files.
The projection for the next RHIC run (also called the Year4 run, which will start by the end of 2003) shows an increase by a factor of five in the number of collected events. Using traditional methods, this brings our production turnaround time to the next order of magnitude, from a month to a year. Within a very active and aggressive physics program, the locally available processing resources will be severely challenged to deliver the science on a reasonable time scale. The situation will become more and more problematic as our physics program evolves toward the search for rare probes: the current decadal plan (the development of STAR activities over the next 10 years) clearly describes the need for several upgrade phases, including a factor of 10 in data taking and throughput by 2007.
Data management (including making the second pass of the production physics summary tape available to our remote institutions) and pass0 facility production are becoming problematic. The STAR computing strategy has anticipated these difficulties and planned to cope with them by engaging in Grid activities and using middleware tools to resolve some of the issues. It has nevertheless become clear that we must expand and reach out to other Grid collaborations in order to converge rapidly and consolidate the infrastructure.
STAR is already a member of the Particle Physics Data Grid (PPDG) collaboration. Its mission statement includes the dissemination of knowledge and Grid awareness at BNL, and STAR has engaged in bringing the local experiments and teams together toward the realization of joint activities. In this regard, STAR has initiated and carried out many cross-experiment (Phenix/RHIC, Atlas), department (ITD) or Grid related projects and activities beyond its limits (J-Lab, NERSC
The Information Technology Division (ITD)
A major component of ITD’s mission is to support the scientific research and advanced technology initiatives of the Laboratory by:
· Providing a reliable and secure high speed networking structure for scientific and business computing.
· Developing a scientific computing infrastructure that can support the computing needs that are common among the scientific departments.
· Providing support to scientific programs to help them address their unique computing problems.
· Providing a cost-effective, highly reliable, secure and standard computing infrastructure for business and administrative communication.
· Promoting best practices in the areas of hardware, software, training and support.
ITD’s Scientific Computing Support Section provides facilities and expertise to enable and improve the computational science posture of the Laboratory.
ITD also collaborates with the RCF, STAR and Atlas on their growing Grid. ITD's role is to bring the technology and lessons learned from incorporating the Grid into the activities of these projects to the other scientific disciplines at BNL and to ensure compliance with DOE regulations.
Since its formation, the STAR collaboration has been part of the PPDG Collaboratory Pilot. Although the effort has been only lightly funded, the progress and activities have been substantial.
Our activities over the past year have been clearly focused on realizing a production Grid infrastructure and deploying a set of tools without disrupting our main RHIC program, which is our primary mission to the DOE: delivering science to our NSF and/or DOE institutions and universities and to the scientific community at large. Our main strategy was therefore oriented toward either introducing additional (low risk) tools or providing users with stable and robust APIs that shield our scientists from the fabric details.
A few activities can illustrate the strategy and path we have taken:
In addition, several activities were started as collaborations between STAR and ITD, as well as with remote computer scientists and CS research groups such as the SDM[7] group at LBNL (reliable data transfer using SRM[8] and the Replica Registration Service, or RRS, to name only those). For the coming year, we in fact intend to
Moreover, we believe we will be in a position to open the Grid to our users for running analysis jobs by 2004. To this end, we will need a user certificate and VO management system, functional grid-ware database solutions, logger utilities at the application level, and an error recovery and job tracking system, some of which are part of ongoing investigative activities.
We feel that the incorporation of the Grid into the daily activities of such a large experiment will provide a path for other scientific groups at BNL to follow as the Grid technologies permeate the scientific computing landscape.
Production level data replication across the US Grid has been demonstrated as a successful first step, but data production and data mining remain at an embryonic stage (at the developer level) and require hardening and interoperability before general use among the user community can occur. The ongoing RHIC physics program, with its increasing data throughput, drives the need to use the distributed resources available across an international set of collaborating institutions. In fact, no fewer than four institutions will join the STAR Grid efforts in the coming months. This puts us in a good position to test the many facets of the actual Grid infrastructure, not only under controlled Monte-Carlo-like testing but also at the level of user analysis jobs and interactive analysis using the Grid.
The STAR/ITD collaboration would greatly benefit from joining the iVDGL project and we believe this benefit will be mutual.
· Several of our ongoing activities overlap with iVDGL's interests (monitoring, testing and/or deployment of VDT as a common toolkit, use of MDS, etc.). Bringing our experience and feedback to the source of these activities, while benefiting from iVDGL’s greater awareness of direction and development, will lead to a net benefit by avoiding duplication of effort.
· iVDGL has extensive experience in the packaging and distribution of standard toolkits. We hope to learn from this experience in distributing our own software components to organizations both at and outside BNL.
· While iVDGL aims to drive the Grid toward everyday production use at the Petabyte scale, we believe that STAR, as an active ongoing experiment, can truly bring this goal to reality: we are an experiment with real Petabyte scale data samples every year, and since we are resource constrained we do need to migrate both our production and our user analysis to a Grid infrastructure. We believe that existing experiments, with a large number of potential early users of the Grid, offer a superb and immediately available test-bed for Grid infrastructure and developments.
· We believe we can play an important role in projects such as Grid3. While Grid3 is shaped around providing infrastructure and services for LHC-oriented production, we firmly believe that experiments already running offer a unique (user) environment in which the base components of the Grid can be stress tested, allowing possible problems to be caught at an early stage. We see this as a major strength of our possible collaborative efforts, as the robustness and scalability of the components can be immediately assessed by our community.
· Since we share the same facilities with other experiments, participation in existing collaboratory pilot projects is fundamental to us: we can cross benefit from the development, experience and manpower put into other areas, allowing for a faster convergence toward our Grid activity goals.
· Furthermore, we view the participation of ITD as a crucial ingredient of a successful long term Grid deployment. ITD has shown a strong interest in Grid activities and joined the STAR team in the PPDG effort. In order to broaden the scope of Grid activities beyond the Physics department, where it is currently confined, and to develop a real Brookhaven Grid, we believe strong participation of the STAR/ITD team in iVDGL activities to be an essential step toward establishing a driving force which will propel us all toward success. Since Grid activities are pushing the frontier of networking, security and software applications in general, it is natural to include in the long term planning our local IT team, which has the most experience in dealing with a large heterogeneous community. Participation in issues such as security is essential to ensure compatibility between the Grid ideals and the DOE cyber-security laboratory regulations. For a facility support group, participating in and teaming with iVDGL / iGOC is natural.
The STAR and ITD teams have already begun to work on a production grid environment at BNL. At this point in our development, we feel it appropriate to seek out and join collaborations such as iVDGL for the mutual benefit of all. We look forward to working with iVDGL in the near future.
[1] Brookhaven National Laboratory http://www.bnl.gov/
[2] The Solenoid Tracker at RHIC http://www.star.bnl.gov/
[3] Information and Technology Division http://infranet.bnl.gov/itd/
[4] Relativistic Heavy Ion Collider http://www.bnl.gov/RHIC/
[5] Center for Data Intensive Computing http://www.bnl.gov/CDIC/
[6] MONitoring Agents using a Large Integrated Services Architecture http://monalisa.cacr.caltech.edu/
[7] Scientific Data Management Research Group http://sdm.lbl.gov/
[8] Storage Resource Management http://sdm.lbl.gov/SRM/