In a sea of data, Bioinformatics Resource Center rides genomic wave

July 2, 2013
From a tiny, dried spot of blood, Phil Farrell is diving deep into the genetic mysteries of cystic fibrosis.
A life-threatening congenital disease of the lungs and digestive system, cystic fibrosis is caused by the malfunction of a single gene. The defect prompts the body to produce thick, sticky mucus that clogs the lungs, sometimes resulting in serious or even fatal infection, and interferes with the ability of the pancreas to secrete key enzymes that help the body break down and absorb food. The condition affects about 30,000 people in the United States, and about 70,000 worldwide.
And while treatment of the disease has advanced, permitting people to live much longer and fuller lives, cystic fibrosis manifests itself differently in different patients, complicating treatment and posing a puzzle for biomedical science.
"The disease presents itself differently in individual patients and we don't know why," explains Farrell, an emeritus professor of pediatrics and population health sciences in the University of Wisconsin School of Medicine and Public Health. "It could be something intrinsic, genetic factors or gender, or it could be diet, or something in the environment like respiratory pathogens."
In an effort to tease out some of the mystery, Farrell and his colleagues have embarked on an ambitious project to decipher the genetic nuances of cystic fibrosis. An understanding of the molecular mechanisms at play in cystic fibrosis promises improved treatment for patients who just 50 years ago rarely survived childhood.
Farrell's project depends on blood drawn from nearly 300 cystic fibrosis patients, many of them from Wisconsin's pioneering Newborn Screening Program at the State Laboratory of Hygiene. By extracting DNA from the blood, a process honed by Farrell's colleague Mei Baker, his team will have access to the entire genomes of a broad cross section of patients. By exploring those genomes, Farrell's group can not only detail the gene implicated in the disease, but also comb the roughly 25,000 genes in the human genome to see if other genes or mutations are at work.
The ability to sequence and analyze such a mass of genomic information is relatively new to science, and ambitious projects like Farrell's are beginning to emerge at institutions worldwide. But Farrell has an edge. In July 2012, the UW-Madison Bioinformatics Resource Center opened for business, providing one-stop shopping for genetic sequencing, genome assembly, analysis and a host of services to help UW-Madison faculty and others make sense of the sea of data generated by new technologies that have put the secrets of human, plant, animal and microbial genomes within tantalizing reach.
"The most significant thing I've learned is you get a sample of blood, extract DNA and process it with genomic analysis, and that is just barely getting you to first base," says Farrell, whose group is one of dozens from around campus taking advantage of the Bioinformatics Resource Center. "The bioinformatics is really critical."
Situated in the UW-Madison Biotechnology Center, directed by biochemistry professor Michael R. Sussman, the Bioinformatics Resource Center (BRC) is a collaboration that also includes the Waisman Center and the State Laboratory of Hygiene. The director of the BRC is Xiao-yu Liu, recently recruited from the Mayo Clinic.
"This is a key resource for campus," notes Charles Konsitzke, an assistant director of the Biotechnology Center who was instrumental in establishing the center. "The data sets are so large that people don't know where to begin to mine that data."
Projects utilizing the BRC, explains Konsitzke, range from analyzing small single data sets to large project collaborations such as the cystic fibrosis initiative directed by Farrell. Other studies range from yeast and viral genetics to the analysis of ancient DNA from archaeological samples.
To illustrate the complexity, Liu points to the human genome. "It has two copies of 23 chromosomes, consisting of 6 billion bases or genetic letters. 'War and Peace' by Leo Tolstoy has 3 million characters and prints in about 1,500 pages. If the human genome is written down in a book, it is roughly two thousand volumes of 'War and Peace.'"
In the case of a project like Farrell's, the collaboration is essential not only because of the mass of accumulating data, but also because of the challenge of organizing and making sense of the information that can be extracted from it. "In a sense, the bioinformatics is the bridge between the lab analysis and the discovery of knowledge to help you understand the disease," says Farrell. "We're looking at the whole genomes of almost 300 people. If we can find unknown genetic modifiers, we might be able to improve on a treatment strategy for patients that is now one size fits all."