Grid infrastructures in European research: lessons from EGEE

By Danielle Venton, Enabling Grids for E-sciencE/CERN, Switzerland

Budgets limit research, and computing budgets are often considered last. Grids can help ease these financial constraints by pooling distributed resources, thus spreading both the computing load and the cost. Grids also allow contributing countries to retain their existing resources, rather than contribute new resources to a centralized location.
The grid computing solution grew from the needs of the high energy physics (HEP) community and has been adopted by scientists in scores of disciplines. User communities have often invested much of their own energy and resources to make grid computing a success, convinced of its value for their work and as a framework for collaboration. For example, in forming the Worldwide LHC Computing Grid (WLCG) – a global collaboration of more than 140 computing centres in 33 countries – the HEP community showed that grids:

  • are an economical way to meet data-management needs (avoiding reapplication to funding agencies)
  • promote collaboration and cooperation, since members receive recognition for contributing resources
  • allow public participation, as seen in volunteer computing initiatives such as LHC@home.

Another example is the grid computing infrastructure managed by EGEE, an e-Infrastructure that allows global and secure access to processing power, data, software and data storage. EGEE’s software is scalable and dynamic, and can be extended as required. In the final year of its third phase, EGEE has set its sights on sustainability, technology transfer and integration with cloud services.

In the clouds
Cloud computing makes heavy use of virtualization, and grid computing may be able to learn from this approach, enhancing user experience by simplifying operations and building friendlier user interfaces (see ‘Clouds and grids: evolution or revolution?’).
Increasingly, ‘proof-of-concept’ developments are investigating ways of integrating cloud-based services with existing grid-enabled applications across a number of disciplines. Projects such as RESERVOIR and StratusLab have shown that such integration can be beneficial, but the transition to everyday production requires more work. Potential benefits include simpler application porting and reduced operational costs.
Cloud computing currently raises a number of concerns. For example, if resources are located in the United States – the home of many cloud providers – the U.S. government may have access to users’ data. This may not be acceptable for many international scientific projects. In addition, if a cloud provider goes out of business, access to hosted data is not necessarily guaranteed, and transferring data between clouds is difficult.

Tailoring compute solutions
Most user communities are primarily interested in accessing easy-to-use, powerful and secure data management facilities, without concern for how these facilities are provided. Since each style of e-Infrastructure has advantages and drawbacks, users should be able to select solutions that best suit their needs. Hence commercially operated clouds can be a good solution for users who need additional resources on demand, while users who operate complicated applications may require sophisticated and expensive high-performance computer resources, such as supercomputers. (To opt for supercomputing, users must be prepared for significant financial and engineering investments, since matching their software to the resource’s architecture is generally no small task.)
The interoperability work EGEE has undertaken with supercomputing infrastructures such as DEISA, volunteer grids such as EDGeS, and cloud systems such as DigitalRibbon has been driven by the needs of users who wish to access a range of resources and systems; such work will surely gain importance in the future. Standards bodies such as the Open Grid Forum are also working to simplify interoperability between clouds, grids and supercomputer installations.

Lessons learned: the planning stage
When building an e-Infrastructure, the planning stage is the most important, since it lays the foundation for the rest of the project.
Early planning of the networking layer is critical. In the WLCG, for example, institutions are arranged in ‘tiers’ corresponding to their computing responsibilities; network capacity must match the tier hierarchy. Achieving this required much negotiation and joint work between National Research and Education Networks (NRENs) and the GEANT project.
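As a rough illustration of what "network capacity must match the tier hierarchy" can mean at the planning stage, the sketch below flags sites whose network link falls short of a minimum set for their tier. The site names, link speeds and per-tier minimums are hypothetical values chosen for the example; they are not WLCG figures.

```python
from dataclasses import dataclass

# Hypothetical minimum link capacity per tier, in Gbit/s.
# Illustrative values only, not actual WLCG requirements.
MIN_CAPACITY_GBPS = {1: 10.0, 2: 1.0}


@dataclass(frozen=True)
class Site:
    name: str         # hypothetical site label
    tier: int         # tier the site belongs to
    link_gbps: float  # planned network link capacity


def under_provisioned(sites):
    """Return the names of sites whose link is below their tier's minimum."""
    return [s.name for s in sites if s.link_gbps < MIN_CAPACITY_GBPS[s.tier]]


sites = [
    Site("site-a", 1, 10.0),
    Site("site-b", 1, 2.5),   # below the assumed Tier-1 minimum
    Site("site-c", 2, 1.0),
]

print(under_provisioned(sites))
```

A check of this kind only makes the mismatch visible; closing the gap is the part that needs negotiation with the network providers.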
Common data formats are important to enable data sharing. If possible, collaborations should adopt existing data standards, which are more likely to have open source and commercial product support. Otherwise, projects must agree on a common set of data formats.
Practice makes perfect: a project should gradually deploy and test its infrastructure. For example, the WLCG organised a series of “service challenges” held over a number of years, each with defined and successively more demanding quality of service targets.
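The idea of staged challenges with successively more demanding targets can be sketched as follows. This is a minimal example under stated assumptions: the target and measured values are invented, the unit (for instance a sustained transfer rate) is left abstract, and the function names are hypothetical rather than part of any WLCG tool.

```python
def targets_are_progressive(targets):
    """True if every target is strictly more demanding than the previous one."""
    return all(earlier < later for earlier, later in zip(targets, targets[1:]))


def first_failed_stage(targets, measured):
    """Return the 1-based number of the first stage that missed its target,
    or None if every stage met it."""
    for stage, (target, result) in enumerate(zip(targets, measured), start=1):
        if result < target:
            return stage
    return None


# Hypothetical quality-of-service targets and measured results per challenge.
targets = [100, 250, 500]
measured = [120, 260, 430]

print(targets_are_progressive(targets))
print(first_failed_stage(targets, measured))
```

Stopping at the first missed target reflects the gradual approach described above: a later, harder stage is only worth attempting once the earlier one has been met.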

Lessons learned: training
Spreading knowledge about new techniques and technologies is key to unifying and extending the community.
Hands-on tutorials and interactive eLearning events, each tailored to specific needs, are more effective than generic presentations.
Outreach to young researchers, who are often unaware of new research tools, is particularly important.
Training courses are a great opportunity to port home-grown applications; prior to training, many scientists are not aware of the advantages grid technology can bring to their research.

Lessons learned: technology’s human dimension
The grid is not flat: even if all sites are technically equal, they have varying levels of importance to each user community. Importance grows if a site

  • is run by colleagues,
  • is reliable,
  • holds important data sources,
  • has important capacity (the WLCG, for example, has defined a hierarchy of tiers).


People want to help: collaborators are their own strongest allies. Initially the WLCG planned on having just a handful of Tier-1 centres; now it has more than 100 sites, and the resources provided by Tier-2s, at nearly 60%, outweigh those provided by Tier-1s. Grids empower contributors, turning them into stakeholders instead of spectators. Partners take pride in contributing to a project and making it work. The same phenomenon can be seen with volunteer grids such as BOINC’s SETI@home.
Working through cultural differences is an interesting and fundamental process for all international collaborations. For example, the WLCG agreed to publish the state of grid sites, reaching consensus among all collaborators, even those from cultures in which such transparency can be interpreted as criticism and is frowned upon.

Conclusions
As science changes, e-Infrastructures must also adapt. Since the benefits of grid computing parallel the requirements of many research collaborations, grids will continue to be an effective way to meet data management needs. EGEE has built the world’s largest grid for scientific research and is the cornerstone of the European Grid Initiative, which will sustain this grid infrastructure for the European Research Area.
The primary value of e-Infrastructures such as EGEE is their role as a framework for collaboration, on both a physical and human level. Grids, when well planned and managed, support and encourage the best science possible.

Zero-In - Issue 3 - 2
