Large language models can effectively persuade people to believe conspiracy theories
Abstract
Large language models have been shown to be persuasive across a variety of contexts, but it remains unclear whether this persuasive power favors truth over falsehood. In three pre-registered experiments (N = 2,724 Americans), participants discussed a conspiracy theory about which they were uncertain with a jailbroken variant of GPT-4o, which was instructed either to argue against ("debunking") or for ("bunking") that conspiracy. The jailbroken model was as effective at increasing conspiracy belief as at decreasing it. Concerningly, the bunking AI was rated more positively, and increased trust in AI more, than the debunking AI. A standard GPT-4o with its guardrails intact produced very similar effects.