Version 1.9 July 31, 2003

Open Science Grid

We propose that the DOE and the NSF endorse a roadmap for the U.S. to build a national grid infrastructure for science: the Open Science Grid.

We propose a rapid program of work to federate many of the currently separate grid resources at labs and universities into a single scalable, engineered, and managed grid.

Starting with the U.S. LHC grid resources, we will build and operate a peta-scale, production-quality grid that extends internationally to create a global grid for LHC science. This grid will immediately serve as a backbone to merge the grid computing efforts of other experiments in particle and nuclear physics, and can rapidly be extended to other scientific communities.

The Open Science Grid will



Roadmap to the Open Science Grid

The LHC is an exemplar global science project supported by a partnership of DOE and NSF. It is exemplary in its need for, and basis in, worldwide collaboration, and in its drive to use grids on a global scale to enable its scientific mission. The LHC experiments are driving grid technology in order to share the costs and burdens of the immense computing and storage resources needed and, more importantly, to enable the scientists working on the experiments to participate fully in the science from the far corners of the globe.

U.S.-developed grid middleware and emerging grid services [1] are at the base of all the LHC grid work, and up to now the U.S. prototypical and experiment-specific grids [2] have led the way with prototypes and demonstrations of how the global computing and data grid system can be built.

Much of the LHC grid infrastructure in Europe will be provided by a combination of CERN central resources and a consortium of European centers that propose [3] to federate some of their resources in a grid for e-science in Europe. Centers in the U.S. and other parts of the world will federate with this European grid infrastructure in order to provide the global computing grid for LHC science.

It is now time for the U.S. to also federate its LHC computing resources and in doing so to continue to lead the efforts towards a global grid for LHC science. We propose to provide and operate these resources at the national laboratories and universities as the initial seed for the Open Science Grid.

The work of federating U.S. computing resources into a scalable, well-engineered, and managed production-quality grid for LHC science will allow us to build the Open Science Grid as a new national grid infrastructure serving thousands of scientists.

In building this new and bold computational infrastructure for science, the U.S. will be in a global leadership position for computational and data intensive science, and the Open Science Grid will benefit a large number of stakeholders in fundamental and new ways. This will have a broad impact on research in many fields and shape the future use of grids in industry.

As each experiment-specific set of applications, grid services, and computing fabrics is incorporated into the Open Science Grid, we will enrich and extend this national grid. The initial focus will be physics, serving the needs of existing and next-generation programs. The Open Science Grid that emerges will provide an infrastructure into which other sciences can be similarly incorporated, forming the basis for a global computing grid for science.

The Open Science Grid creates a seamless environment for experiment collaborations and applications: a virtual computing service and a ubiquitous responsive work environment for scientists. It enables individuals and groups of researchers at universities throughout the U.S. to become full participants in a new generation of worldwide science projects, and lets them collaborate with science groups around the world, lowering the entry barrier for smaller communities to use grids.

The Open Science Grid will benefit a range of U.S. organizations that are providing computing for science: university groups and computing centers, national laboratories, and providers from the private sector. It will enable them to form and participate in a coherent national computing infrastructure. It will let them provide and manage their computing facilities across their broader program and will allow them to profit from the economies made possible by sharing expertise and support.

The Open Science Grid will build upon R&D grid efforts [1, 2, 3], utilizing services provided by existing grid organizations [4] and facilities provided by allied computing consortia [5].

Work Plan

In the roadmap to build the Open Science Grid we foresee several phases. The Open Science Grid will grow by adding experiments and scientific communities, thus expanding the number of participating sites, and by collaborating with other grid computing infrastructures such as the TeraGrid and the DOE Science Grid.

The initial phase of the Open Science Grid will federate the LHC physics applications’ grid services and computing resources in the U.S. into a global grid system, engineered and managed to serve the needs of the LHC scientific program. National laboratories and universities participating in the U.S. LHC software, computing and network development efforts will form the initial sites with special roles for the U.S. LHC Tier-1 centers at BNL and Fermilab.

In the next phases, applications from other physics communities, specifically Run II experiments at Fermilab, RHIC experiments at BNL, and BaBar at SLAC, will move their resources and applications to the Open Science Grid. Other experiments and communities, such as our PPDG and iVDGL partners, will join and further extend the Open Science Grid and, in subsequent phases, other non-physics science applications will be included.

Each application will bring its dedicated computing resources to be federated with the Open Science Grid. We expect that the initial costs of integrating a new application will be partially offset through economies of scale, although considerable up-front investment will be necessary to reap the eventual benefits and cost savings of a robust shared national grid infrastructure.

To build the Open Science Grid, the following work elements will be needed:



The metrics of success for this program will be specific performance indicators, to be defined for each of the applications, as well as the ability to run on resources not “owned” by the application, thus making effective use of the Open Science Grid.





Budget

A budget of between $40M and $50M spread over 4 years is estimated to be needed to build the core Open Science Grid and to provide some funding for studies and functional demonstrators for several of the candidate applications that would join in phases two and beyond. This is a remarkably small investment to achieve such an ambitious national grid infrastructure, taking into consideration several important factors:



To assure success there must be a constant effort of approximately 15 FTEs applied to the work elements 1-4: Management (4 FTEs), Engineering (4 FTEs), International Coordination (4 FTEs), and Education and Outreach (3 FTEs) ($2M/year total).

In order to migrate the experiment-specific grid services to the Open Science Grid, some core servers and services owned and operated by the federation, rather than simply contributed by one of the partners, will have to be put in place and operated as a robust round-the-clock service ($2M/year).

The initial-phase work of federating the LHC resources and migrating both CMS and ATLAS applications to a common U.S. grid is estimated to require approximately $4M in development costs in the first year and subsequently between $6M and $7.5M per year in ongoing integration and operations costs.

In parallel, opportunities to consolidate services and integrate non-LHC experiments and resources should be studied and demonstrated ($750K/year).

The additional funding necessary to fully migrate a particular experiment or application area in phases two and beyond will then be better understood and detailed proposals for this will have to be developed.
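As a rough cross-check, the cost items above can be summed against the $40M to $50M estimate. The sketch below is illustrative only: it assumes that each per-year figure applies to all four years and that the $6M to $7.5M range for LHC integration and operations applies to years two through four. The variable and function names are ours, not part of the proposal.

```python
# Rough cross-check of the budget figures quoted in the text (all values in $M).
YEARS = 4

# Per-year costs stated in the text; assumed here to be constant over all four years.
coordination_ftes = 2.0   # ~15 FTEs on work elements 1-4
core_services = 2.0       # federation-owned core servers and services
non_lhc_studies = 0.75    # consolidation studies and demonstrators

# LHC federation: $4M in the first year, then $6M to $7.5M per year
# (assumed here to cover years two through four).
lhc_first_year = 4.0
lhc_later_low, lhc_later_high = 6.0, 7.5


def four_year_total(lhc_later_per_year: float) -> float:
    """Sum all cost items over the four-year program for a given LHC ongoing cost."""
    recurring = YEARS * (coordination_ftes + core_services + non_lhc_studies)
    lhc = lhc_first_year + (YEARS - 1) * lhc_later_per_year
    return recurring + lhc


low = four_year_total(lhc_later_low)
high = four_year_total(lhc_later_high)
print(f"Estimated four-year total: ${low}M to ${high}M")
```

Under these assumptions the itemized costs come to roughly $41M to $45.5M, which falls within the quoted range and leaves some margin at the upper end.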

1 Globus (http://globus.org), Condor, and others

2 Through the experiments’ test-beds and the U.S. Physics Grid projects PPDG (http://ppdg.net/), iVDGL (http://ivdgl.org/), and GriPhyN

3 Proposal “Enabling Grids for E-Science and Industry in Europe” (http://egee-ei.web.cern.ch/egee-ei/New/Home.htm), a €33M proposal to the European Commission, over 2 years, starting in 2004

4 e.g. the DOE Science Grid (http://doesciencegrid.org)

5 e.g. the TeraGrid project (http://www.teragrid.org/)

6 The Grid-I functional Grid demonstration (http://www.ivdgl.org/grid3/) is moving the initial production environments of U.S. LHC onto a Grid based on common middleware, the VDT (http://lsc-group.phys.uwm.edu/vdt/).