Back to Explorer
Research PaperResearchia:202601.12558857[Artificial Intelligence > Artificial Intelligence]

Reasoning Models Will Blatantly Lie About Their Reasoning

William Walden

Abstract

It has been shown that Large Reasoning Models (LRMs) may not say what they think: they do not always volunteer information about how certain parts of the input influence their reasoning. But it is one thing for a model to omit such information and another, worse thing to lie about it. Here, we extend the work of Chen et al. (2025) to show that LRMs will do just this: they will flatly deny relying on hints provided in the prompt in answering multiple choice questions -- even when directly asked to reflect on unusual (i.e. hinted) prompt content, even when allowed to use hints, and even though experiments show them to be using the hints. Our results thus have discouraging implications for CoT monitoring and interpretability.

Submission:1/12/2026
Comments:0 comments
Subjects:Artificial Intelligence; Artificial Intelligence
Original Source:
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Reasoning Models Will Blatantly Lie About Their Reasoning | Researchia