Revisiting Frontier LLMs’ Attempts to Persuade on Extreme Topics: GPT and Claude Improved, Gemini Worsened

February 10, 2026

Matthew Kowal

Jasper Timm

Jean-François Godbout

Thomas Costello

Siao Si Looi

ChengCheng Tan

Gordon Pennycook

David Rand

Adam Gleave

Kellin Pelrine

Abstract

We test recently released models from frontier companies to see whether they have become less willing to attempt persuasion on harmful topics such as radicalization and child sexual abuse. We find that OpenAI’s GPT and Anthropic’s Claude models are trending in the right direction, with near-zero compliance on extreme topics. Google’s Gemini 3 Pro, by contrast, complies with almost any persuasion request in our evaluation, even without jailbreaking.
