A study that questioned the reliability of scientific research made huge waves last year, but its key finding was likely overblown because of numerous mistakes in methodology, scientists reported Thursday.
The initial study, published in August in the peer-reviewed journal Science, attempted to replicate 100 previously published studies, and found success just 39 percent of the time.
The results -- named as the third biggest story of the year by Science magazine in its "Breakthroughs of the Year" edition -- "led to changes in policy at many scientific journals, changes in priorities at funding agencies, and it seriously undermined public perceptions of psychology," said researcher Daniel Gilbert, a professor of psychology at Harvard University.
But a new look at the methods of that study suggests it was riddled with errors and may have overestimated the failure-to-replicate rate.
"Readers surely assumed that if a group of scientists did a hundred replications, then they must have used the same methods to study the same populations," said Gilbert.
"In this case, that assumption would be quite wrong."
In some cases, the consortium of 270 scientists, known as the Open Science Collaboration (OSC), tried to replicate a study in a different geographic location -- a choice that sometimes set the repeat experiment up for failure.
- A 5,000-mile testing gap -
One such study attempted to redo an experiment on racial attitudes originally conducted at a prominent California university, but used Dutch students who did not share the cultural attitudes or experiences relevant to the US policy at the center of the experiment -- affirmative action, which aims to boost minority groups' access to higher education.
"They had Dutch students watch a video of Stanford students, speaking in English, about affirmative action policies at a university more than 5,000 miles away," said Gilbert.
It didn't work. Even more troubling, said Gilbert, the research team anticipated that their replication would fail, so they also tried it at a US university. That attempt did work, but the team included only the negative finding in its final analysis, thereby distorting the takeaway message.
"The failure of the replication studies to match the original studies was a failure of the replications, not of the originals," said Gilbert.
Other problems included allowing scientists to choose which experiments they would attempt to repeat, possibly introducing bias to the results.
"All the rules about sampling and calculating error and keeping experimenters blind to the hypothesis -- all of those rules must apply whether you are studying people or studying the replicability of a science," said co-author Gary King, professor at Harvard University.
The Harvard team stopped short of suggesting any intentional wrongdoing by the initial team.
"No one involved in this study was trying to deceive anyone," said Gilbert. "They just made mistakes, as scientists sometimes do."
Indeed, the original team, led by Brian Nosek of the University of Virginia, cooperated with the Harvard team's investigation, Gilbert said.
Nosek wrote an accompanying article in the current issue of Science, in which he agreed with some parts of the critique -- including that "differences between laboratories and sample populations reduce reproducibility."
But he did not entirely back away from his team's findings.
The 2015 study "provides initial, not definitive, evidence -- just like the original studies it replicated," Nosek wrote.