High-speed GÉANT and JANET research networks enable global collaboration on 1000 Genomes Project
Cambridge, UK, June 9 2011 - The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), which hosts the world's largest public collection of molecular biology databases, is using the pan-European GÉANT research network and JANET, the UK research network, to help biologists share vital data across the globe.
EMBL-EBI relies on the fast, secure transmission of large amounts of biological data between its campus in rural Cambridgeshire and its many partners around the world. The EMBL-EBI website receives more than 3.5 million information requests every day, and over 80 terabytes of data is transmitted by EMBL-EBI over the high-speed, high-bandwidth JANET and GÉANT networks every month. JANET transports EMBL-EBI information within the UK; GÉANT then communicates it to national networks around Europe, to the US via links with Internet2 and to China via links with CERNET.
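To put the monthly volume in network terms, the 80 terabytes quoted above can be converted to an average sustained rate. The following is a minimal Python sketch, not an official EMBL-EBI or GÉANT calculation. It assumes decimal units (1 terabyte = 10^12 bytes) and a 30-day month, neither of which is stated in the release, and the function name is purely illustrative.

```python
# Assumptions (not from the release): decimal units (1 TB = 10**12 bytes)
# and a 30-day month.
BYTES_PER_TERABYTE = 10**12
SECONDS_PER_DAY = 86_400


def average_rate_mbps(terabytes_per_month: float, days_in_month: int = 30) -> float:
    """Return the average sustained rate in megabits per second."""
    total_bits = terabytes_per_month * BYTES_PER_TERABYTE * 8
    seconds = days_in_month * SECONDS_PER_DAY
    return total_bits / seconds / 10**6


if __name__ == "__main__":
    rate = average_rate_mbps(80)
    print(f"{rate:.0f} Mbit/s average")
```

Under these assumptions, 80 terabytes a month works out to roughly 247 Mbit/s averaged around the clock. Real traffic is unlikely to be this even, so peak demand would presumably be considerably higher.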
EMBL-EBI is a key partner in many global initiatives. One of these, the 1000 Genomes Project, is sequencing the genomes of 2500 people around the world and studying the minute differences that make people unique. The knowledge generated in this project is being used to advance our understanding of human health by explaining genetic susceptibility to disease or responses to particular drugs, for example. Chaired by Richard Durbin at the Wellcome Trust Sanger Institute in the UK and David Altshuler at the Broad Institute in the US, the Project includes participants from Europe, the Americas and Asia. Researchers at EMBL-EBI are responsible for creating a strategy to characterise these variations, and for creating the bioinformatics infrastructure to support the massive movement of these data.
The pilot phase of the 1000 Genomes Project, completed in 2010, created 4.9 terabases of DNA sequence and uncovered 8 million variations that had never been seen before. By its completion in 2012, the project is expected to produce between 60 and 80 terabases of data, equivalent to around 250,000 gigabytes.
Information currently flows to EMBL-EBI from seven sequencing centres in China, Germany, the UK and the US; it is then mirrored to the National Human Genome Research Institute (NHGRI) in the US and supplied to 40 groups across the world for initial analysis. Final datasets are then accessed by another 100 groups of researchers.
"Data generated by biological experiments is doubling every five months, driven by leading-edge initiatives such as the 1000 Genomes Project. Our mission at EMBL-EBI is to make the results of these international collaborations freely available to the scientific community wherever they are located," said Dr Paul Flicek, Head of Vertebrate Genomics at EMBL-EBI. "To do this we need an infrastructure that is robust, flexible and high performance, linking us to our partners across the globe. Our close working relationship with the JANET and GÉANT networks delivers the speed and capacity that we need, giving us confidence and allowing us to focus on sharing data to push forward scientific progress."
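As a rough illustration of what a five-month doubling time means, the growth multiple over any period can be computed directly. This is a minimal Python sketch assuming steady exponential growth at the quoted rate, which is a simplification rather than a forecast. The function name and the default doubling period parameter are illustrative only.

```python
def growth_factor(months: float, doubling_period_months: float = 5.0) -> float:
    """Multiple by which a quantity grows over `months`, assuming it
    doubles once every `doubling_period_months` (steady exponential growth)."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    for months in (5, 10, 12):
        print(f"after {months} months: x{growth_factor(months):.2f}")
```

On this assumption, data volumes would quadruple in ten months and grow a little more than fivefold in a year.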
By 2020, biological data generation is expected to reach thousands of times the current rate. This growth far exceeds predicted increases in storage capacity, meaning current models of centralised data resources will not be able to cope. ELIXIR, an ESFRI (European Strategy Forum on Research Infrastructures) project of global significance, aims to create a stable infrastructure for biological data in Europe that distributes information across multiple locations while making it available to researchers wherever they are. ELIXIR is coordinated by EMBL-EBI and is currently entering its construction phase. Once created, ELIXIR will rely on high-speed networks such as GÉANT, JANET and other national research networks across Europe to deliver data to scientists in real time.
"Mapping the DNA of thousands of organisms, including the human genome, is leading to breakthroughs in medical research that can potentially deliver better health outcomes for people across the world," said Matthew Scott, General Manager of DANTE, the organisation which on behalf of Europe's National Research and Education Networks (NRENs) has built and operates the GÉANT network. "The European Bioinformatics Institute is leading the way by making these complex, large datasets freely available to international researchers. EMBL-EBI's use of GÉANT and JANET is at the heart of its mission to create and share information by providing direct access to its vast range of biological data resources and tools. With research now relying on fast access to distributed data resources, high speed, robust networks are at the heart of pushing back the frontiers of scientific knowledge."
About the European Bioinformatics Institute
The European Bioinformatics Institute (EBI) is part of the European Molecular Biology Laboratory (EMBL) and is located on the Wellcome Trust Genome Campus in Hinxton near Cambridge, UK. The EBI grew out of EMBL's pioneering work in providing public biological databases to the research community. It hosts some of the world's most important collections of biological data, including DNA sequences (EMBL-Bank), protein sequences (UniProt), animal genomes (Ensembl), three-dimensional structures (the Protein Data Bank in Europe), data from gene expression experiments (ArrayExpress), protein-protein interactions (IntAct) and pathway information (Reactome). EMBL-EBI hosts several research groups and its scientists continually develop new analytic tools for the life science community. It is coordinating ELIXIR, a pan-European research infrastructure for biological information. http://www.ebi.ac.uk/
About GÉANT
GÉANT is the high-speed European communication network dedicated to research and education. In combination with its NREN partners, GÉANT creates a secure, high-speed research infrastructure that serves 40 million researchers in over 8,000 institutions across 40 European countries. Operating at speeds of up to 40 Gbps, GÉANT is the world's largest and most advanced multi-gigabit network dedicated to research and education. Building on the success of its predecessors, GÉANT has been created around the needs of users, providing flexible, end-to-end services that transform the way that researchers collaborate. GÉANT is at the heart of global research networking through wide-ranging connections with other world regions, underpinning vital projects that bridge the digital divide and benefit society as a whole.
Co-funded by the European Commission under the EU's 7th Research and Development Framework Programme, GÉANT is the e-Infrastructure at the heart of the EU's European Research Area and contributes to the development of emerging internet technologies. The project partners are 32 European National Research and Education Networks (NRENs), TERENA and DANTE. GÉANT is operated by DANTE on behalf of Europe's NRENs. For more information, visit http://www.geant.net
About DANTE
DANTE is a non-profit organisation that coordinates large-scale projects co-funded by the European Commission and works in partnership with European National Research and Education Networks (NRENs) to plan, build and operate advanced networks for research and education. Established in 1993, DANTE has been fundamental to the success of pan-European research and education networking. DANTE has built and operates GÉANT, which provides the data communications infrastructure essential to the success of many research projects in Europe. DANTE is involved in worldwide initiatives to interconnect countries in other regions to one another and to GÉANT. DANTE currently manages projects focussed on the Mediterranean, Asia-Pacific and central Asia regions through the EUMEDCONNECT, TEIN and CAREN projects respectively. For more information, visit www.dante.net.
About ELIXIR
The purpose of ELIXIR is to develop the plan for a sustainable infrastructure for biological information in Europe. This plan focuses on generating stable funding for Europe's most important publicly accessible databases of molecular biological information, and on developing a compute infrastructure that can cope with the biological data deluge. ELIXIR is one of 44 research infrastructures recommended by the European Strategy Forum on Research Infrastructures (ESFRI, http://cordis.europa.eu/esfri/) as being of key strategic importance to Europe's future. ELIXIR holds a special place among these because it will provide infrastructure for the other biological, medical and environmental research infrastructures being developed. ELIXIR will provide: data resources; bio-compute centres; an infrastructure for integration of biological data, software tools and services throughout and beyond Europe; support for other European infrastructures in biomedical and environmental research; and services for the research community, including training and standards development. This will enable ELIXIR's users to meet the European Grand Challenges, which are almost all biological, namely: healthcare for an ageing population, a sustainable food supply, competitive pharmaceutical and biotechnology industries and protection of the environment. To date, Finland, Denmark, Spain, Sweden and the UK have committed funds to ELIXIR and the project is actively seeking the support of other nations. For more information, visit www.elixir-europe.org