Teams of Rivals

Adversarial collaborations offer a rigorous way to resolve opposing scientific findings, inform key sociopolitical issues, and help repair trust in science.


November-December 2025

Volume 113, Number 6
Page 336

DOI: 10.1511/2025.113.6.336

In a personal history published in American Psychologist in 2003, economics Nobel laureate Daniel Kahneman lamented the needless acrimony and counterproductivity of scientific disagreements hashed out without structure or standards.

“I am convinced that the time I spent on a few occasions in reply–rejoinder exercises would have been better spent doing something else,” he wrote. “Both as a participant and as a reader, I have been appalled by the absurdly competitive and adversarial nature of these exchanges, in which hardly anyone ever admits an error or acknowledges learning anything from the other. Doing angry science is a demeaning experience—I have always felt diminished by the sense of losing my objectivity when in point-scoring mode.”

QUICK TAKE
  • Contradictory findings persist in social sciences, undermining validity, reliability, and public trust in research. Practices such as open science cannot fully address this problem.
  • Entrenched positions result from a system that rewards novelty, supports fiefdoms, and disincentivizes publishing null results and reproducibility studies.
  • Adversarial collaborations offer a neutral framework for resolving disputes and mutually publishing findings, but they require wider adoption to be effective.

Kahneman advocated for a better way: adversarial collaborations—structured efforts in which scientists who disagree on a theory, finding, or interpretation work together to resolve their disagreements. “My hope is that . . . adversarial collaboration may eventually become standard. This is not a mere fantasy: It would be easy for journal editors to require critics of the published work of others—and the targets of such critiques—to make a good-faith effort to explore differences constructively. I believe that the establishment of such procedures would contribute to an enterprise that more closely approximates the ideal of science as a cumulative social product.”

Traditionally, disagreeing scholars run independent research programs, writing papers, critiques, and rejoinders back and forth, aiming to persuade the scientific community that their side is correct. The process is often open-ended and inconclusive. We believe that adversarial collaborations, if widely adopted, could lead to more productive outcomes.

Yuki Murayama

Neutral Ground

As in any effective debate, successful adversarial collaborations require setting clear ground rules. Scholars must agree on which positions to argue and on the values, criteria, and evidence by which they will judge which position “wins.” They must agree that confirmation of their opponents’ hypothesis casts doubt on their own and mutually publish the results. Some adversarial collaborations comprise only the sparring sides, whereas others use a neutral referee agreed upon by the adversaries or appointed by an outside body such as a journal’s editorial board.

One successful example of such a collaboration, reported in 2023 in Proceedings of the National Academy of Sciences of the U.S.A. (PNAS), concerned the relationship between wealth and happiness. In it, Matthew A. Killingsworth of the University of Pennsylvania and Kahneman, a professor at Princeton University, resolved much of their long-standing debate over a possible income threshold governing the relationship between wealth and emotional well-being. Barbara Mellers, also of the University of Pennsylvania, served as a neutral referee, contributing to the adversarial collaboration design and to cowriting the resulting paper.


In 2010, Kahneman had reported in a PNAS paper that happiness increased with income but that the relationship plateaued somewhere between $60,000 and $90,000. In contrast, in a 2021 paper, also in PNAS, Killingsworth had found a consistent and unabated rise in happiness as wealth increased. Their adversarial collaboration confirmed Kahneman’s flattening pattern but found that it applied only to the least happy 20 percent of the population. As it turned out, the disagreement arose from the authors’ differing, though standard, data analysis practices and assumptions. Thus, the collaboration not only addressed the key disagreement, it also amplified a methodological caveat: Although certain statistical practices and assumptions are standard, social scientists must use them cautiously and rigorously.

It’s clear to us that adversarial collaborations have much to offer science and exemplify the spirit of honest scientific inquiry. Yet more than two decades after Kahneman coined the term “adversarial collaboration” and hoped these procedures would become standard, the practice remains rare, and its promise is still largely theoretical.

There are many legitimate reasons for the dearth of adversarial collaborations. Participating requires that scientists relinquish some of their autonomy to accommodate the preferences of others. These collaborations also require navigating the logistics of choosing team members, agreeing to schedules, and negotiating a host of methodological disagreements with adversaries.

Yet, as contradictory claims accumulate, the need for adversarial collaborations grows deeper. This necessity is especially pronounced in many social sciences, where we see divergent findings coexist and receive empirical support over very long periods, sometimes over entire careers. One team will publish a finding that another team counters with a contrary finding, and this back-and-forth can go on for decades. Or, worse, the two contrary lines of scholarship occur in parallel, but the advocates for one side only rarely grapple with the issues raised by the other. We believe this pattern is partly due to misaligned incentive structures in science that reward novelty, which often requires contradicting prior work. We believe adversarial collaborations can also overcome these impasses.

The alternatives leave much to be desired. Occasionally, a resolution of contradictory claims results from the retirement or death of a key proponent rather than from a deep resolution among scientists. As German physicist Max Planck wrote in his 1949 work, Scientific Autobiography and Other Papers, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” This is unquestionably an embarrassing situation for science. Must someone really retire or die before we scientists relinquish our cherished theories?

Reliability and Validity

Using adversarial collaborations, we can do better—not only by settling disputes more scientifically than, as Planck is sometimes paraphrased, “one funeral at a time,” but also through the selection of tests by which we settle disputes. We speculate that teams comprising divergent scholars are likelier to use strong inference, the powerful hypothesis-testing methodology described by the late physicist John R. Platt in his seminal 1964 Science paper, “Strong Inference,” and severe testing, the concept philosopher Deborah Mayo develops in her 2018 book Statistical Inference as Severe Testing. Strong inference refers to setting up a test between mutually exclusive predictions and publishing the findings regardless of the results, rather than what we fear has become the norm: seeking confirmatory evidence and “file drawering” null results (again, this state of affairs is not entirely scientists’ fault; null results are notoriously difficult to publish). Beyond the rare “existence proof” (for example, how finding a single black swan disproves the claim that all swans are white), no single social science study can conclusively refute any particular theory; scientific claims frequently involve boundary conditions and zones of uncertainty. Nonetheless, we see value in comparing predictions made by competing theories to determine which one has a more successful track record.
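As a toy illustration of strong inference, the sketch below (with invented numbers and a simulated dataset, not drawn from any study discussed here) pits two mutually exclusive predictions against the same pre-agreed test and reports a verdict whichever way it falls:

```python
import random

random.seed(7)

# Hypothetical setup: Theory A predicts a positive treatment effect,
# Theory B predicts no effect. Both sides agree in advance on the data,
# the test statistic, and the decision rule, and commit to publishing
# the outcome regardless of which theory it favors.
def simulate_group(mean, n=200):
    return [random.gauss(mean, 1.0) for _ in range(n)]

treatment = simulate_group(mean=0.3)   # assumed true effect, for the demo
control = simulate_group(mean=0.0)

observed_diff = sum(treatment) / len(treatment) - sum(control) / len(control)

# Permutation test: how often does a chance relabeling of the pooled data
# produce a difference at least as large as the one observed?
pooled = treatment + control
n_extreme = 0
n_perms = 2000
for _ in range(n_perms):
    random.shuffle(pooled)
    perm_diff = sum(pooled[:200]) / 200 - sum(pooled[200:]) / 200
    if perm_diff >= observed_diff:
        n_extreme += 1
p_value = n_extreme / n_perms

# Pre-agreed decision rule, reported no matter which side it favors
verdict = "favors Theory A" if p_value < 0.01 else "favors Theory B"
print(f"diff={observed_diff:.3f}, p={p_value:.4f}, result {verdict}")
```

The essential feature is not the particular statistic but the commitment made before seeing the data: the decision rule is fixed in advance, and the result is published either way.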

Teams of rivals were able to greatly narrow their differences because of discussions with opposing team members.

Severe testing is the related idea that the scientific community ought to accept a claim only after it surmounts rigorous tests designed to find its flaws, rather than tests optimally designed for confirmation. The strong motivation each side’s members will feel to severely test the other side’s predictions should inspire greater confidence in the collaboration’s eventual conclusions. If such procedures became mainstream, scholars might adopt more rigorous standards at the outset to preempt such challenges.

Some readers may wonder whether the more recent and popular practices known as open science, which include openly sharing data and materials and publicly preregistering research plans before collecting data, might address some of these concerns. Preregistration refers to the practice of creating a written, publicly available document that lays out a study’s hypotheses, methods, and planned analyses, and that explains how the scientists will interpret results with respect to confirmation or disconfirmation of those hypotheses. We strongly support open science practices. They help improve science’s reliability—our confidence that, when we use the same methods again, we will obtain similar results. Moreover, transparency reduces opportunities for the kinds of scholarly wiggle room (dropping cases or variables, for example) that previously enabled the reporting of statistically significant effects from null datasets.

But adversarial collaborations address a different, possibly more serious problem than reliability: validity. We know that science suffers from a validity crisis because countless scientific papers directly contradict each other. Under the rules of Aristotelian logic and Popperian falsifiability that characterize most modern science, such contradictions mean that many scientific conclusions, replicable or not, must be incorrect. Adversarial collaborations can help address this problem by letting scientists merge perspectives over time and identify critical boundary conditions that lead to more unified understandings.

Yuki Murayama

Open science, moreover, lacks safeguards against bias. Good scientists strive to avoid bias, but as cognitive and social psychology remind us, such prejudices can occur unconsciously and persist unrecognized. Bias can result from the typical lack of viewpoint diversity among collaborators because like-mindedness often prevents team members from considering key issues. Importantly, bias resulting from such viewpoint “bubbles” can occur even among team members who use the most common and well-known open science practices—such as preregistering hypotheses, methods, power analyses, and statistical procedures, and making raw data publicly available—because such initiatives place no constraints on how scientists select methodological procedures. Preregistration does not require that a team include a devil’s advocate who will propose alternative hypotheses, dependent variables, operationalizations of core constructs (how to make concepts testable or measurable), or methodologies. Nor does it demand that someone strive to falsify rather than to confirm.

In short, unlike adversarial collaborations, open science practices do not involve implementing severe tests or selecting team members to disagree over what evidence would disconfirm a theory. Thus, though laudable, these practices do not address biases or other key shortfalls the way that adversarial collaborations can.

Sociopolitical Implications

Beyond resolving long-standing scientific disagreements, we believe that adversarial collaborations could increase both the validity and credibility of scientific conclusions with important sociopolitical implications, such as gender bias in tenure-track hiring. Based on the contents of top science journals such as Nature and Science, findings by blue-ribbon National Academies of Sciences, Engineering, and Medicine commissions, and articles in respected legacy media such as The New York Times, one might assume that gender bias in academia is settled science, and that women are less likely than men to be hired, promoted, published, or rated as competent.

If, as some studies suggest, social psychology has undergone a homogenization of political views—or, in any case, if one accepts the assertion that research is often conducted by teams of people who share the same views regarding the research topic—then it follows that hypotheses that challenge the assertion of gender bias in tenure-track hiring will face an uphill battle before anyone can pose, test, and publish them (assuming they ever do so). If so, then the existence of tenure-track gender bias might be either a valid finding or merely a concrete manifestation of the social sciences’ lack of viewpoint diversity. We do not intend to adjudicate this question here; rather, we seek to shine a light on how research can become ideologically thorny when it lacks a mechanism such as adversarial collaboration to help resolve contradictory conclusions.

Indeed, an adversarial collaboration two of us published with economist Shulamit Kahn of Boston University in Psychological Science in the Public Interest in 2023 attempted to resolve previously published contradictory findings on this very topic. Our results showed that claims of gender bias in the academic sciences are often incorrect or lack needed qualifications. In our view, representation from different viewpoints can help elucidate this topic and others with sociopolitical implications, such as studies of affirmative action, gender bias, abortion, DEI (diversity, equity, and inclusion) attestations, implicit racism, and immigration.

To take one example, various controversies beset the notion of implicit bias (unconscious or unrecognized prejudice), including how strongly such bias predicts discrimination, whether measures of it differ from explicit prejudice, whether implicit trainings do more harm than good, and whether scores of 0 on the Implicit Association Test (an assessment tool for detecting belief associations and implicit bias) in fact correspond to egalitarian attitudes. Similar controversies surround topics such as the role of racism in policing; microaggressions; the relative prevalence of biases among those on the political right versus the political left; the effectiveness of gender-affirming care; the magnitude and importance of sex differences; the predictive validity of standardized achievement tests; and the validity of various low-cost interventions in academic achievement.

For such politically charged issues, adversarial collaborations might also help repair ongoing declines in scientists’ credibility among the public. Over the past two decades, aside from a small recent uptick, trust in academia has been declining among conservatives. As Morgan Marietta and David C. Barker argue in their 2019 book, One Nation, Two Realities, the more Americans across the political spectrum become aware of the left-leaning skew within academia, the less credible they find claims made by its members. And recent research suggests that the apparent politicization of scientific journals and programs undermines public trust in those institutions and saps the public’s willingness to defer to scientists’ expertise.

We hold that scientists might earn back that trust through the improved validity made possible by adversarial collaborations. That validity stems partly from the improved rigor we expect will result from negotiating methods with contrary team members, and partly from a tendency of adversarial collaborations to limit confirmation biases and reduce leaping to unjustified conclusions.

Consider, for example, an adversarial collaboration concerning the effects of accuracy prompts, which ask users to evaluate the trustworthiness of a source or the correctness of a headline, on the quality of news-sharing decisions (and potentially the spread of misinformation). Published in a Psychological Science paper in 2024, the collaboration addressed a disagreement among researchers over whether such prompts work on politically right-leaning audiences—an important question, because research suggests that Americans on the right are more likely to share misinformation. To test the question, the authors used a multiverse meta-analysis, which subjects data to the full range of analytical decisions a researcher might plausibly make, thereby testing how sensitive the results are to those choices. The authors found that party affiliation did moderate the effectiveness of accuracy prompts, but that the weaker effect among Republicans was not consistent and depended on “operationalizations of ideology/partisanship, exclusion criteria, or treatment type.”
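The logic of a multiverse analysis can be sketched in a few lines. The toy example below (invented data and invented analytic choices, not the published analysis) runs every combination of two decisions a researcher might defensibly make, an exclusion rule and an outcome measure, and reports how the estimated effect varies across specifications:

```python
import itertools
import random

random.seed(42)

# Invented toy data: each record has a group label, a response time,
# and two candidate outcome measures a researcher might choose between.
records = [
    {
        "group": random.choice(["treatment", "control"]),
        "rt": random.uniform(0.2, 12.0),          # response time, seconds
        "accuracy": random.uniform(0.4, 1.0),
        "confidence": random.uniform(0.0, 1.0),
    }
    for _ in range(500)
]

# Two analytic decisions, each with defensible alternatives:
exclusion_rules = {
    "keep_all": lambda r: True,
    "drop_slow": lambda r: r["rt"] < 10.0,        # exclude slow responders
}
outcomes = ["accuracy", "confidence"]

def effect(data, outcome):
    """Mean difference (treatment - control) on the chosen outcome."""
    t = [r[outcome] for r in data if r["group"] == "treatment"]
    c = [r[outcome] for r in data if r["group"] == "control"]
    return sum(t) / len(t) - sum(c) / len(c)

# The "multiverse": one estimate per combination of analytic choices.
results = {}
for (rule_name, rule), outcome in itertools.product(
        exclusion_rules.items(), outcomes):
    kept = [r for r in records if rule(r)]
    results[(rule_name, outcome)] = effect(kept, outcome)

for spec, est in results.items():
    print(f"{spec}: effect = {est:+.3f}")
# A robust effect keeps its sign and rough size across all specifications;
# an effect that appears only under particular choices warrants caution.
```

Real multiverse analyses cross many more decisions (exclusion criteria, covariates, model forms, operationalizations), yielding hundreds or thousands of specifications, but the principle is the same: Report the whole distribution of estimates rather than a single favored analysis.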

Team members who acknowledge such uncertainties will reach conclusions that are more valid than those reached by more blinkered colleagues who erroneously believe their evidence is compelling. Moreover, as adversaries articulate their competing perspectives, they may find that their disagreements are much smaller than they originally thought. In a 2024 article in American Psychologist, we described several cases in which teams of rivals were able to greatly narrow their differences through discussions with opposing team members.

We believe that adversarial collaborations can also create synergies with other best practices and approaches, including preregistration. Good science can benefit enormously from this public airing because it minimizes post hoc claims that undermine scientific validity. Consider two notable preregistration variants: registered reports (RRs) and registered replication reports (RRRs). In the former, authors submit a preregistration to a journal or review platform (such as the Peer Community in Registered Reports) for consideration. The journal can offer an in-principle acceptance: If the authors conduct the study and interpret it as described, the journal will likely publish it. Such reports address publication bias because the journal will publish the research whether the results are significant or not, and without regard for any theoretical or political position the findings might refute. RRRs work the same way, except they are proposed replication studies. Both variants work well with adversarial collaborations, which can help ensure their fairness and rigor. For their part, RRs and RRRs can help hold adversarial collaborators to an explicit research plan that minimizes postgame quarreling.

A New Standard

Our research and our experience indicate that the significance of adversarial collaborations extends well beyond any one specific outcome. More important than whether the individual adversaries change their theories, we argue, is the opportunity the entire scientific community will gain to evaluate the findings produced by an adversarial collaboration team—a structured, rigorous review of competing predictions derived from those theories. As the Hungarian philosopher Imre Lakatos and others have argued, scientists view a theory as plausibly debunked not when its most prestigious or aggressive proponents admit to its refutation, but when the community of diverse scholars reaches a consensus that the theory is useless, of extremely limited value, or debunked.

Neither adversarial collaborations nor preregistrations are guaranteed to solve these or any other empirical or theoretical controversies. However, we contend that the synergistic combination of adversarial collaborations and preregistered reports comes as close as the field currently can to such a standard. And, as Nelson Cowan argued in the Journal of Applied Research in Memory and Cognition in 2022, preregistered adversarial collaborations should advance psychological science more effectively, rigorously, and quickly than any known alternative.

Consider a parallel notion common in the tech sector, which has long employed “red team” members to play devil’s advocate and hunt for flaws in code and systems. Red team members function much like the opposing members of an adversarial collaboration. Programmers and engineers routinely try every way to break their own prototypes and dismantle their own creations because they know that, once a product is released, someone else most certainly will.

Even if they improve psychological science, adversarial collaborations have their limits, many of them human. Any member of any team of theoretical or political adversaries may still retain their own limitations, biases, or blind spots. Thus, whatever the team members produce can and should remain subject to evaluation by conventional scientific standards, including the wider community of scholars and scientists.

Adversarial collaborations can increase both the validity and credibility of scientific conclusions with important sociopolitical implications.

Moreover, convincing opposing sides to join adversarial collaborations can pose a challenge. Each side has much to lose if the findings undermine their past claims, which may form the very foundations of many of their careers. But there is also a powerful incentive for scientists to join adversarial collaborations: If they decline the invitation, someone else with less at stake—someone who may not advocate as diligently for their past claims—may take their place. Of course, researchers who join an adversarial collaboration may actively work to find loopholes, undermine their critics, or even sabotage the process to preserve their favored positions. However, participants can substantially curtail the power of these bad actors by establishing safeguards and protocols up front, such as appointing a neutral referee to adjudicate internal disputes.

With all of these factors in mind, we propose the establishment of an infrastructure within journals or externally that will support and encourage conflicting perspectives, emphasizing adversarial collaborations but also utilizing other mechanisms (such as RRRs) that infuse studies with opposing voices. Particularly when the research question concerns a highly contentious topic or a weighty sociopolitical issue, the members of each adversarial collaboration team should represent the competing views or ideological positions that are most affected. Ideally, funding tracks specific to adversarial collaborations might be established within funding bodies for particularly controversial or high-stakes research questions. Research agencies might implement a funding bonus on top of regular grants for authors who accept the risks of working across the aisle with rivals or opponents. Journals might establish dedicated article tracks. Finally, the infrastructure should include a means of commissioning retrospective meta-studies to determine whether adversarial collaborations do in fact score higher on validity and reliability.

The University of Pennsylvania is home to one such nascent effort: the Adversarial Collaboration Project. Project members are currently carrying out multiple adversarial collaborations, including on the issues of motivated reasoning, implicit bias, and political bias in psychology, and have collaborated with a few academic journals, including Advances in Methods and Practices in Psychological Science, to promote adversarial collaboration research. But the project is still small and new, and adversarial collaborations will require broader buy-in from the entire scholarly community.

We can do better. If research teams included scholars with opposing research agendas, the social sciences would likely see stronger inferences and more severe testing, thereby optimizing methods and producing better science. Adversarial collaborations are an idea whose time has come, and their commissioning could lead to a much-needed cultural shift in science.

Bibliography

  • Ceci, S. J., C. J. Clark, L. Jussim, and W. M. Williams. 2024. Adversarial collaboration: An undervalued approach in behavioral science. American Psychologist. Published online doi:10.1037/amp0001391.
  • Ceci, S. J., S. Kahn, and W. M. Williams. 2023. Exploring gender bias in six key domains of academic science: An adversarial collaboration. Psychological Science in the Public Interest 24:15–73.
  • Clark, C. J., T. Costello, G. Mitchell, and P. E. Tetlock. 2022. Keep your enemies close: Adversarial collaborations will improve behavioral science. Journal of Applied Research in Memory and Cognition 11:1–18.
  • Isch, C., P. E. Tetlock, and C. J. Clark. 2025. Reflections on adversarial collaboration from the adversaries: Was it worth it? Theory and Society. Published online doi:10.1007/s11186-025-09634-2.
  • Kahneman, D. 2003. Experiences of collaborative research. American Psychologist 58:723–730.
  • Killingsworth, M. A., D. Kahneman, and B. Mellers. 2023. Income and emotional well-being: A conflict resolved. Proceedings of the National Academy of Sciences of the U.S.A. 120:e2208661120.
