

Large-Scale Study Successfully Replicates About Half of a Sample of Social and Behavioral Science Findings

Prof. Dr. Heinrich Nax co-authors a new study published in Nature whose implications point to uncertainty, not failure, and to a need for more cautious confidence in research claims.

A new study published in Nature, "Investigating the replicability of the social and behavioural sciences", reports results from a large-scale effort to test whether published findings can be replicated when tested again with new data. Overall, about half of the claims replicated successfully, and replication effect sizes were smaller, on average, than the original effect sizes.

The results come from the most systematic, cross-disciplinary replication effort of its kind to date, conducted as part of the Systematizing Confidence in Open Research and Evidence (SCORE) program funded by the Defense Advanced Research Projects Agency (DARPA). A collaboration of 292 researchers from 199 institutions around the world attempted to replicate findings from 164 papers published between 2009 and 2018 in well-known journals across several fields, including business, economics, education, political science, psychology, and sociology. The replications were designed to be rigorous, well powered, preregistered, and transparent, and each went through peer review in advance to improve research quality.


Investigating the replicability of the social and behavioural sciences

Brian Nosek (Center for Open Science) et al.

First published online in Nature on April 1, 2026

doi.org/10.1038/s41586-025-10078-y

Abstract
Pursuing replicability — independent evidence for previous claims — is important for creating generalizable knowledge. Here we attempted replications of 274 claims of positive results from 164 quantitative papers published from 2009 to 2018 in 54 journals in the social and behavioural sciences. Replications were high powered on average to detect the original effect size (median of 99.6%), used original materials when relevant and available, and were peer reviewed in advance through a standardized internal protocol. Replications showed statistically significant results in the original pattern for 151 of 274 claims (55.1% (95% CI 49.2–60.9%)) and for 80.8 of 164 papers (49.3% (95% CI 43.8–54.7%)), weighted for replicating multiple claims per paper. We observed modest variation in replication rates across disciplines (42.5–63.1%), although some estimates had high uncertainty. The median Pearson’s r effect size was 0.25 (95% CI 0.21–0.27) for original studies and 0.10 (95% CI 0.09–0.13) for replication studies, an 82.4% (95% CI 67.8–88.2%) reduction in shared variance. Thirteen methods for evaluating replication success provided estimates ranging from 28.6% to 74.8% (median of 49.3%). Some decline in effect size and significance is expected based on power to detect original effects and regression to the mean due to replicating only positive results. We observe that challenges for replicability extend across the social and behavioural sciences, illustrating the importance of identifying conditions that promote or inhibit replicability.
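
The headline statistics can be sanity-checked with a few lines of arithmetic. The short Python sketch below is our own illustration, not the authors' analysis code: it recomputes the claim-level replication rate under the assumption of a Wilson score interval (which reproduces the reported 95% CI) and the reduction in shared variance implied by the median effect sizes. The paper's 82.4% figure is aggregated across claims, so the naive point-estimate version comes out slightly higher.

import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion at ~95% confidence."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Claim-level replication rate: 151 of 274 claims replicated.
lo, hi = wilson_ci(151, 274)
print(f"replication rate: {151 / 274:.1%} (95% CI {lo:.1%}-{hi:.1%})")
# -> replication rate: 55.1% (95% CI 49.2%-60.9%)

# Shared variance is r squared, so the reduction implied by the median
# effect sizes (0.25 original vs 0.10 replication) is:
r_orig, r_rep = 0.25, 0.10
print(f"shared-variance reduction: {1 - (r_rep / r_orig) ** 2:.1%}")
# -> shared-variance reduction: 84.0% (the paper's 82.4% aggregates per claim)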



Further Information

About

Prof. Dr. Heinrich Nax is SNF Eccellenza Assistant Professor at the SUZ and a behavioral game theorist. His research interests include experimental and behavioral market design and learning in games, applied to markets and collective goods.

SCORE

Assessing the credibility of research claims is a central and continuous part of the scientific process, but current assessment strategies often require substantial time and effort. To accelerate research progress, the Center for Open Science (COS) partnered in 2019 with the Defense Advanced Research Projects Agency (DARPA) on its Systematizing Confidence in Open Research and Evidence (SCORE) program, working towards automated tools that provide rapid, scalable, and accurate confidence scores for research claims.