Statisticians fret over misuse of data-crunching software

The rise of complex data-crunching techniques and software has put powerful statistical approaches into the hands of more scientists. While this has democratized the ability to probe information for insights, it has also increased the risk that scientists will misuse statistics and report false positives. Statisticians are worried.

Columbia University assistant professor Victoria Stodden is among the statisticians to speak out about the risks. The Economist spoke to Stodden as part of its in-depth look at the flaws and failings of the scientific process that threaten the reliability of research underpinning biopharma and other industries. As Stodden sees it, some of the problems stem from the evolution of data-crunching software outpacing scientists' understanding of statistics. The result is the misuse of statistical methods.

In some cases, scientists use an inappropriate method because it is the one they know best, or because it is new. Other researchers simply use the methods that come with the software, without understanding when or why an approach is appropriate. The Economist reports that such failings are widespread and, when considered alongside other weaknesses in research and peer review, raise doubts about the reliability of scientific papers. These papers form the first step in the drug development pipeline.

Amgen ($AMGN) has already questioned the reproducibility of cancer research papers. Writing in Nature last year, researchers from the big biotech reported on their efforts to replicate 53 landmark papers in the basic science of cancer. Despite collaborating closely with many of the original authors, the Amgen team reproduced the results of just six of the papers, or about 11%.

