Large language models can effectively persuade people to believe conspiracy theories
Abstract
Large language models have been shown to be persuasive across a variety of contexts, but it remains unclear whether this persuasive power favors truth over falsehood. In three pre-registered experiments (N = 2,724 Americans), participants discussed a conspiracy theory about which they were uncertain with a jailbroken variant of GPT-4o, which was instructed either to argue against ("debunking") or for ("bunking") that conspiracy. The jailbroken model was as effective at increasing conspiracy belief as at decreasing it. Concerningly, the bunking AI was rated more positively, and increased trust in AI more, than the debunking AI. A standard GPT-4o with its guardrails intact produced very similar effects.