Grid-powered software analyzes patents

Simultaneous analyses of text and image data are helping researchers probe the intersections between biology and chemistry, according to the Fraunhofer Institute in Germany. The institute is working with the Jülich Supercomputing Centre on automated annotation software for grid-connected supercomputers, which are being used to query some 50,000 pharmaceutical chemistry patents.

The partners have processed the patents on the large-scale computing grid infrastructures at the two institutions. Automated "named entity recognition" services have identified and annotated biological entities in text (e.g., protein names, gene names, cell types); medical entities in text (including disease names); chemical information in text, including drug names; and images, including chemical structure depictions.
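To make the idea of named entity recognition concrete, here is a minimal sketch of a dictionary-based annotator for text. It is not the Fraunhofer software: the lexicon, the labels, and the `annotate` function are all hypothetical, and production services rely on far larger dictionaries and trained models.

```python
import re

# Hypothetical toy lexicon, for illustration only.
LEXICON = {
    "protein": ["insulin", "hemoglobin"],
    "disease": ["diabetes", "asthma"],
    "drug": ["aspirin", "metformin"],
}


def annotate(text):
    """Return sorted (start, end, label, matched_text) tuples for lexicon terms in text."""
    annotations = []
    for label, terms in LEXICON.items():
        for term in terms:
            # Match whole words only, ignoring case.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append((match.start(), match.end(), label, match.group()))
    return sorted(annotations)


if __name__ == "__main__":
    for span in annotate("Metformin is used to treat diabetes."):
        print(span)
```

Each result records where a term occurs and which category it belongs to, which is the kind of annotation a downstream query could filter on.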

UNICORE middleware helps manage the annotation services in the grid infrastructure, control the streams of input and output data from the patents database to the annotation services, and monitor overall progress.

Text-mining applications have so far been run only on bibliographic databases of life sciences and biomedical information, according to an announcement. Simultaneous text/image analyses in full-text documents on grid infrastructures represent a next step in computing. "This goes way beyond the usual simulation applications" used in scientific computing, says Martin Hofmann-Apitius, department head for bioinformatics at Fraunhofer.

