Rare disease dictionary gives bioinformatics databases a shared language

While bioinformatics lies at the leading edge of biological research, it is still underpinned by one of the oldest tools available to humanity: communication. Recognizing this, researchers are trying to standardize terminology through the creation of a rare genetic disease dictionary.

The project stemmed from rising awareness that while many researchers are working to identify rare diseases and their related genes, their findings are siloed in multiple, unconnected databases. Worse still, each database uses different phenotypic terms to describe the same genetic disease. An old case study that refers to "mental retardation" is covering the same territory as a modern database entry for "intellectual disability," but it takes a human eye to spot the link. To a computer, they are two different, unrelated terms.
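To make the problem concrete, here is a minimal Python sketch of why a plain string comparison misses the link and how a shared vocabulary closes the gap. The `SYNONYMS` table and the function names are hypothetical illustrations, not part of any real database or of the consortium's work; only the two example terms come from the text above.

```python
# Hypothetical synonym table: every legacy or variant label points to one
# preferred term. A real vocabulary would hold thousands of entries.
SYNONYMS = {
    "mental retardation": "intellectual disability",
    "intellectual disability": "intellectual disability",
}


def normalize(term: str) -> str:
    """Lowercase, collapse whitespace, then map to the preferred term if known."""
    key = " ".join(term.lower().split())
    return SYNONYMS.get(key, key)


def same_phenotype(a: str, b: str) -> bool:
    """Compare two labels after mapping both to their preferred terms."""
    return normalize(a) == normalize(b)


# Raw comparison: the computer sees two unrelated strings.
raw_match = "mental retardation" == "intellectual disability"

# Comparison through the shared vocabulary: the link is found.
normalized_match = same_phenotype("Mental Retardation", "intellectual disability")
```

Without the lookup table the two labels never match. With it, the old case study and the modern entry resolve to the same term.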

Humans can spot the links, but the scale of bioinformatics projects makes manual identification impractical. "If you're comparing unsolved exomes and you've got 20,000 variants and all kinds of different phenotypes, you must have some high-level standardization. Computers have a really hard time unless they're speaking exactly the same language," Dr. Ada Hamosh, clinical director of the Institute of Genetic Medicine at Johns Hopkins Medical School, told Bio-IT World. Just as globalization in the 19th century drove a surge of interest in constructed languages for international communication, the bioinformatics boom has created a need for shared phenotypic terminology.

The International Consortium for Human Phenotype Terminologies (ICHPT) is the result. Last month, the collaborators behind ICHPT, which include every major organization doing rare genetic disease research, met for a 17-hour session to standardize 2,700 phenotypes that appear at least twice across six leading databases. The result is a list of accepted terms that Hamosh expects the major databases to have adopted by the start of 2014. With a standardized list in place, rare genetic disease researchers, who include the biopharma companies that have hitched their growth hopes to orphan drugs, will have a better chance of finding patients with the same illnesses.
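The selection rule described above, keeping phenotypes that appear in two or more databases, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the consortium's actual procedure: the database names and term sets are made-up toy data, and the sketch assumes labels can be compared as exact strings, which is precisely what the real effort could not take for granted.

```python
from collections import Counter


def shared_terms(databases: dict[str, set[str]], min_databases: int = 2) -> list[str]:
    """Return the terms that occur in at least `min_databases` databases."""
    counts = Counter()
    for terms in databases.values():
        # Each database contributes at most one count per term.
        counts.update(set(terms))
    return sorted(term for term, n in counts.items() if n >= min_databases)


# Toy data (hypothetical): three small "databases" of phenotype labels.
toy_databases = {
    "db_a": {"intellectual disability", "seizures"},
    "db_b": {"intellectual disability", "short stature"},
    "db_c": {"seizures", "hearing loss"},
}

candidates = shared_terms(toy_databases)
```

Here `candidates` holds only the two labels that occur in more than one toy database. Labels used by a single source drop out of the list.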

- read the Bio-IT World feature