First it was chess. Then it was Jeopardy.
Now computers are at it again, but this time they are trying to automate the scientific process itself.
An interdisciplinary team of scientists at Vanderbilt University, Cornell University and CFD Research Corporation, Inc., has taken a major step toward this goal by demonstrating that a computer can analyze raw experimental data from a biological system and derive the basic mathematical equations that describe the way the system operates. According to the researchers, it is one of the most complex scientific modeling problems that a computer has solved completely from scratch.
The paper that describes this accomplishment is published in the October issue of the journal Physical Biology and is currently available online.
The work was a collaboration among John P. Wikswo, the Gordon A. Cain University Professor at Vanderbilt; Michael Schmidt and Hod Lipson at the Creative Machines Lab at Cornell University; and Jerry Jenkins and Ravishankar Vallabhajosyula at CFDRC in Huntsville, Ala.
The "brains" of the system, which Wikswo has christened the Automated Biology Explorer (ABE), is a unique piece of software called Eureqa, developed at Cornell and released in 2009. Schmidt and Lipson originally created Eureqa to design robots without going through the usual trial-and-error stage, which is both slow and expensive. After it succeeded, they realized it could also be applied to scientific problems.
One of Eureqa's initial achievements was identifying the basic laws of motion by analyzing the motion of a double pendulum. What took Sir Isaac Newton years to discover, Eureqa did in a few hours when running on a personal computer.
In 2006, Wikswo heard Lipson lecture about his research. "I had a 'eureka moment' of my own when I realized the system Hod had developed could be used to solve biological problems and even control them," Wikswo said. He approached Lipson immediately after the lecture, and the two began a collaboration to adapt Eureqa to biological problems.
"Biology is the area where the gap between theory and data is growing the most rapidly," said Lipson. "So it is the area in greatest need of automation."
Software passes test
The biological system that the researchers used to test ABE is glycolysis, the primary process that produces energy in a living cell. Specifically, they focused on the manner in which yeast cells control fluctuations in the chemical compounds produced by the process.
The researchers chose this specific system, called glycolytic oscillations, to perform a virtual test of the software because it is one of the most extensively studied biological control systems. Jenkins and Vallabhajosyula used one of the process's detailed mathematical models to generate a data set corresponding to the measurements a scientist would make under various conditions. To increase the realism of the test, the researchers salted the data with a 10 percent random error. When they fed the data into Eureqa, it derived a series of equations that were nearly identical to the known equations.
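The shape of that virtual test can be illustrated with a toy sketch in Python. Everything in it is an assumption made for illustration: the "known" law (`true_law`), the two variables `x` and `y`, the sample ranges and the helper names are hypothetical, and the brute-force enumeration of small formulas stands in for Eureqa's actual search method, which the article does not describe. The sketch generates data from a known equation, salts it with up to 10 percent random error, and then looks for the formula, built only from addition, subtraction, multiplication and division, that fits the noisy data best.

```python
import itertools
import random

# Hypothetical "known" law standing in for the published glycolysis model.
def true_law(x, y):
    return x * y + x

# The only building blocks the search is allowed to use.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def make_data(n=200, noise=0.10, seed=0):
    """Sample the known law and salt each value with up to +/-10% error."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x, y = rng.uniform(1.0, 3.0), rng.uniform(1.0, 3.0)
        z = true_law(x, y) * (1.0 + rng.uniform(-noise, noise))
        rows.append((x, y, z))
    return rows

def combine(op, fa, fb):
    """Build a new formula that applies `op` to two existing formulas."""
    return lambda x, y: op(fa(x, y), fb(x, y))

def candidates(depth):
    """Return (text, function) pairs for every formula up to `depth` levels."""
    leaves = [("x", lambda x, y: x), ("y", lambda x, y: y)]
    level = list(leaves)
    for _ in range(depth):
        new = []
        for (ta, fa), (tb, fb) in itertools.product(level, repeat=2):
            for symbol, op in OPS.items():
                new.append((f"({ta} {symbol} {tb})", combine(op, fa, fb)))
        level = leaves + new
    return level

def mse(fn, rows):
    """Mean squared error; formulas that divide by zero are ruled out."""
    try:
        return sum((fn(x, y) - z) ** 2 for x, y, z in rows) / len(rows)
    except ZeroDivisionError:
        return float("inf")

def search(rows, depth=2):
    """Pick the candidate formula with the lowest error on the noisy data."""
    return min(candidates(depth), key=lambda c: mse(c[1], rows))

rows = make_data()
text, fn = search(rows)
print("best formula found:", text)
```

Exhaustive enumeration only works here because the formulas are tiny; it is used to keep the sketch short, not because it would scale to a system as complex as glycolytic oscillations.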
"What's really amazing is that it produced these equations a priori," said Vallabhajosyula. "The only thing the software knew in advance was addition, subtraction, multiplication and division."
The ability to generate mathematical equations from scratch is what sets ABE apart from Adam, the robot scientist developed by Ross King and his colleagues at the University of Wales at Aberystwyth. Adam runs yeast genetics experiments and made international headlines two years ago by making a novel scientific discovery without direct human input. King supplied Adam with a model of yeast metabolism and a database of genes and proteins involved in metabolism in other species. He also linked the computer to a remote-controlled genetics laboratory, which allowed it to generate hypotheses and then design and conduct actual experiments to test them.
"It's a classic paper," Wikswo said.
In order to give ABE the ability to run experiments like Adam, Wikswo's group is currently developing "laboratory-on-a-chip" technology that can be controlled by Eureqa. This will allow ABE to design and perform a wide variety of basic biology experiments. Their initial effort is focused on developing a microfluidics device that can test cell metabolism.
"Generally, the way that scientists design experiments is to vary one factor at a time while keeping the other factors constant, but, in many cases, the most effective way to test a biological system may be to tweak a large number of different factors at the same time and see what happens. ABE will let us do that," Wikswo said.
The project was funded by grants from the National Science Foundation, the National Institute on Drug Abuse, the Defense Threat Reduction Agency and the National Academies Keck Futures Initiative.