Industry Voices: Inside Genomics--Q&A with Bina CEO Narges Bani Asadi and Sam Volchenboum

Narges Bani Asadi, founder and CEO of Bina Technologies

By Narges Bani Asadi

One of the elements lacking in the personalized medicine discussion today is the perspective of clinicians and informaticists working in the field. To remedy the gap, I've asked a series of leaders in the industry to offer up their views.

Dr. Samuel Volchenboum is the director of informatics at the University of Chicago. He is an expert in pediatric cancers and blood disorders, with a special interest in treating children with neuroblastoma. He is also pursuing research projects on the cutting edge of bioinformatics, using advanced computer science to solve some of the most complex problems in biology.

We recently sat down to discuss the current state of genomics. Here is our conversation:

What, in your mind, is the single most prohibitive problem that's keeping us from getting to personalized medicine today? What keeps you awake at night?

Dr. Samuel Volchenboum, director of informatics, University of Chicago

The standards for privacy and data sharing have been developed as a result of many violations of trust over the past 100 years. While many of these safeguards serve an important function, it remains very difficult, often impossible, to conduct basic and clinical research using patient data.

The tools exist to perform widespread, in-depth, accurate, and precise analysis of a person's genome, exome, and even proteome/metabolome/biome and other -omes. There is no shortage of samples already in biobanks around the world. These blood, urine, tumor, bone marrow and other specimens have been collected extensively for years and stored in tissue banks at major medical centers and academic institutions.

In most cases, though, two important features are missing.

First, a general consent to use the material for any and all future studies. When samples are collected, most people consent to only a limited use of their sample, and that consent is usually quite restrictive about how a study can be conducted. Second, a maintained link between the sample data and the clinical information, which is startlingly often absent. In other words, it may be possible to anonymously study the mutations or characteristics of DNA, RNA, proteins, etc., over many samples, but without a connection back to clinical data, many of these data are useless.

Most of the time, this lack of connection is a result of an inefficient and poorly designed infrastructure. Other times, it is a natural consequence of an ill-conceived consent process. What the research community is often left with is a trove of incredible data that cannot be connected back to the useful phenotypic information that might best lead to helpful insights.

Progress is being made, of course, and despite a rigid and sometimes draconian set of rules and regulations, research marches on steadily. But should legislation or other regulation allow more unfettered and untethered access to clinical samples and associated patient data, the pace of progress would certainly improve dramatically.

Strict regulations must be in place to protect patients and their rights, and these rules must exist to prevent the kinds of abuse that ran rampant during the 19th and 20th centuries. Nevertheless, it is the patients themselves who clamor for better research and are often the most vocal about the inefficiencies and frustrations around overly paternalistic data stewardship policies.

There needs to be room for compromise and change, and as a result, there will surely be accelerated access to exciting and potentially transformative research findings.

We talk a lot about whole genome analysis, but what about the exome? With the latter, there's less data, and lower costs, covering up to 80% of mutations, statistically. Will the standard of care be to sequence and analyze the exome first, and then move upstream on an as-needed basis? How, in your mind, can we use each to its greatest advantage?

Of course, the area of most intense interest has been the transcribed areas of the genome. The DNA message that is directly translated into active proteins is most likely going to explain the observed clinical phenotype. Nevertheless, concentrating on this limited and tiny proportion of the overall DNA content is going to leave many questions unanswered. Alternative splicing and its control are responsible for many diverse functions, and mutations affecting this process may not be readily evident through exome sequencing.

Upstream untranslated regions are important for translational control, and mutations affecting these regions may have important impacts on protein production. Most interesting are the yet-undescribed "dark matter" functions in the remaining 99% of the DNA. These noncoding regions are unlikely to be completely superfluous, and many diseases and alterations to health will ultimately be traced back to modulations in these areas.

To surmise that 80% of disease-causing mutations are being captured by exome sequencing assumes that we even understand the whole of what we are observing. Mutational analysis is just the beginning. The complex interplay between proteins and other elements such as microRNAs affects both the healthy and disease states. We are in a transition period, during which we have a limited ability to analyze and interpret the enormous stream of data being collected.

Over the coming years, as data collection and sample analysis become even cheaper and results interpretation becomes more streamlined, there will be a shift to a standard of whole genome sequencing. Hopefully the pace of discovery will keep up with the falling price of the technology, lest we end up with huge amounts of minimally interpreted data.

If you could do it all over again, what would you have done differently in your career and/or research to date, so as to have a bigger impact today?

I currently enjoy a unique position as a physician, biologic researcher, and computer scientist. This gives me insights and increased access to many people and technologies not available to those who restrict themselves to a single field. Yet the path to this state has been inefficient and suboptimal.

I spent many years concentrating on molecular biology or enzymatics, and no time developing computer skills or keeping up with emerging technologies. Had I had a greater appreciation for the coming onslaught of big data and the importance of computer scientists in managing and analyzing this information, I would have devoted time earlier and more extensively to bolstering my skills in computer programming, statistics, and data analysis.

The years in medical school and graduate school were important, of course, and they have contributed immensely to my current position. Yet the time during which little or no attention was paid to my computer science or math training could have been better spent. Hindsight being perfect, there is no doubt in my mind that I would be more successful and productive had I decided early on to specialize in big data analytics as applied to biological problems. Students today have much more efficient pathways to this state, a prime example being an M.D./Ph.D. program where the graduate work is performed in bioinformatics or computer science.

This process is going to train the next generation of data analytics stars, and it is this group that is going to be responsible for mining the enormous and almost intractable amount of data being generated.

We've been talking about how genomics will revolutionize patient outcomes for a long time. How close are we really to a cure for cancer? Very close? Not close at all?

A "cure for cancer" will not come in the form of a magic bullet or a single discovery. Cancer will be prevented and sometimes cured through methodical advances in diagnosis, stratification, and treatment. Genomic techniques are helping to pinpoint the causes of cancer and the pathways perturbed that lead to malignancy.

In addition, genomic characterization of tumors has facilitated a better stratification and classification of patients, allowing a more customized approach to therapy. In certain circumstances, a specific mutation or translocation is identified and leads to the development of a treatment targeted to that specific change. In these cases (imatinib for chronic myelogenous leukemia being the index case) a dramatic improvement in outcomes can be observed. But in general, the changes will be incremental.

Over the next 20 years, most diseases will be redefined in terms of modulations in biological pathways and alterations in the kinds and amounts of proteins and their interactions. In parallel, this change in the paradigm of disease classification will see the development of an armamentarium of customized therapies designed to specifically prevent or treat that condition in a particular host. The changes will be slow, but ultimately dramatic and transformative.

There is a certain amount of public concern about the security of personal genetic information. What is being done today and what can be done in the future to ensure the privacy of that information?

Much of the concern for genomic privacy is misplaced and unproductive. Genomic data need to be protected in the same way that any private information is kept safe. A person's genome should no more be made public than their social security number or shopping habits.

The source of much consternation over genomic "privacy" relates back to the consent process. When a patient consents to have their genomic data used for testing, there should be no illusion that their data are nonidentifiable. It may be "deidentified," but there is really no such thing as nonidentifiable.

Someone with the proper motivation and maliciousness has a good chance of eventually being able to identify someone from their genomic data. In addition, people must understand that once their data have been released into the "wild," they are nonretrievable. In other words, they will forever remain public.

If the consent process informs people of these two issues--that their data may be identifiable and that their release cannot be retracted--then much of the issue becomes moot. Where there needs to be a significant amount of work is around legislation to protect people. There are obvious issues, such as using genomic testing results to limit insurance for a patient or even a family member. But there are many other concerns.

What if someone leaves another person's DNA at a crime scene to implicate them? What if a genetic defect is found in a family, and other relatives do not want to know about the potential problems they might have? What if someone has genomic testing for one problem but finds out they have another? What if an employer or insurance company refuses someone because a relative has a genetic predisposition to a serious disease? These issues need to be solved through legislation and enforcement.

Dr. Narges Bani Asadi works at the intersection of life sciences and high-performance computing. Narges is the founder and CEO of Bina Technologies, a company that analyzes and interprets whole human "omes" quickly and at scale.