Linguamatics and ChemAxon Announce Project to Enhance Text Mining in Chemistry

December 14, 2011 05:00 AM Eastern Time

CAMBRIDGE, England & BUDAPEST, Hungary & BOSTON--(BUSINESS WIRE)--Linguamatics and ChemAxon are pleased to announce that they are partnering in a new, path-breaking project funded by EUREKA's Eurostars Programme. The project is code-named "ChiKEL", which stands for Chemically Informed Knowledge Extraction from Literature. ChiKEL will provide the first interactive text mining system designed for chemistry, integrating advanced chemical search and extraction of relationships between structures and other biological or chemical entities. By combining chemical search and text mining, users will be able to perform chemical structure and biological searches that extract structured information from patents, scientific articles, and internal documents for further analysis.

This fully automated approach enables chemical structures to be found in documents where mark-up by hand has not been done, has been done for some structures but not all, or is uneconomic, e.g., for a company's internal reports. Importantly, the new approach is highly scalable and can find chemical structures at particular points within a document, so questions can be posed such as "which chemicals are mentioned as inhibitors of a particular target" or "what role does the chemical have within this document".

The existing integration between Linguamatics' and ChemAxon's software products enables substructure and similarity searching for known compounds. For example, it is possible to interrogate the literature to find properties of compounds that have a particular substructure, such as the targets that a set of compounds inhibit.

The ChiKEL project extends the existing integration to enable recognition of novel chemical compounds expressed in a variety of ways, including IUPAC names and images. In addition to supporting substructure or similarity searching based on a structure drawn by a user, ChiKEL will enhance the presentation of search results so that users can view chemical structures and browse through clusters of structures found within the documents.

Key aims of ChiKEL are to 1) develop gold standards for evaluation, 2) integrate name-to-structure conversion to find novel chemicals, 3) provide structure visualization for search results, and 4) explore image-to-structure conversion.

Applications include: scientific research, intellectual property and commercial intelligence. Specific areas include drug discovery, drug licensing and repurposing, drug safety and pharmacovigilance. Target customers include pharmaceutical and biotechnology companies and adjacent markets such as food, agrochemicals, and healthcare.

Companies interested in being a beta tester for the ChiKEL software should contact Linguamatics at [email protected]

About Linguamatics

Linguamatics is the world leader in deploying innovative "natural language processing" (NLP) based text mining technology for complex, high value problem solving. The Linguamatics approach enables organizations to maximize value from their information resources, synthesizing and distilling massive amounts of documents into meaningful results to support decision making. Linguamatics' agile, scalable I2E software platform is used by nine of the world's Top-10 pharmaceutical companies, and many other prestigious commercial, academic and government organizations. This impressive customer base has been built on producing breakthrough insights from massive amounts of unstructured, textual data, which contains rich but often difficult-to-access, business-critical information. As a result, the company has been self-financing and profitable since its inception in 2001, growing at an average rate of around 50% per annum, with a software license renewal rate that regularly exceeds 95% per year. Three of the original four founders still sit on the management team.

Linguamatics partners and collaborates with commercial, academic and governmental organizations to bring customers the right solution for their needs, and to develop next generation capabilities. Current partners include Oracle, Accelrys and ChemAxon. The company has also received project research funding from the European Union and won a number of awards. Linguamatics operates globally, and has offices in Cambridge, UK, and Boston, MA, USA. For further information, visit

About ChemAxon

ChemAxon is a leader in providing cheminformatics software development platforms and desktop applications for the biotechnology, pharmaceutical and agrochemical industries. With core capabilities for structure visualization, search and management, property prediction, virtual synthesis, screening and drug design, ChemAxon focuses on active interaction with users and on software portability to create powerful, cost-effective, cross-platform solutions and programming interfaces to power modern cheminformatics and chemical communication. The company is privately owned with European headquarters in Budapest and sales and support offices in Europe, Japan and North America.

Linguamatics Limited
Sue Ziobro, Head of Marketing
+44 1223 421 360