Research Paper
Researchia:202605.22012

Reducing Political Manipulation with Consistency Training

Long Phan

Submitted: May 22, 2026
Subjects: AI; Artificial Intelligence

Description / Details

Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically. We refer to this phenomenon as covert political bias and identify seven categories of techniques through which it operates. We propose two metrics for covert bias: Sentiment Consistency measures symmetry in rhetoric and framing across paired political prompts, and Helpfulness Consistency measures symmetry in depth and engagement. To reduce both types of covert bias, we introduce Political Consistency Training (PCT), a reinforcement learning (RL) training method with two complementary paradigms: Sentiment Consistency Training and Helpfulness Consistency Training. We show that PCT preserves overall helpfulness, substantially reduces covert political bias, and generalizes to held-out benchmarks. We release our work at https://political-manipulation.ai
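
The abstract does not give formulas for the two metrics, so the following is only an illustrative sketch of how a paired-prompt consistency score could be computed, not the paper's definition. It assumes each pair of counterpart prompts has already been scored on a shared numeric scale (for example, a sentiment or helpfulness rating for each side). The function name `pair_consistency`, the `max_gap` parameter, and the "one minus mean normalized gap" formula are all hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def pair_consistency(
    scores: Sequence[Tuple[float, float]], max_gap: float = 1.0
) -> float:
    """Hypothetical symmetry score for paired prompts (not the paper's metric).

    Each tuple holds the scores a model's responses received for the two
    counterpart prompts of one pair. The result is 1.0 when every pair is
    scored identically and approaches 0.0 as the gaps approach max_gap.
    """
    if not scores:
        raise ValueError("need at least one scored pair")
    if max_gap <= 0:
        raise ValueError("max_gap must be positive")
    # Normalize each absolute gap to [0, 1], clipping gaps above max_gap.
    gaps = [min(abs(a - b), max_gap) / max_gap for a, b in scores]
    return 1.0 - mean(gaps)


if __name__ == "__main__":
    # Made-up scores for two prompt pairs, purely for illustration.
    example = [(0.8, 0.8), (0.6, 0.2)]
    print(pair_consistency(example))
```

Under these assumptions, the same function could serve either metric by changing what the per-response scores represent; how the paper actually scores responses and aggregates pairs should be taken from the paper itself.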


Source: arXiv:2605.22771v1 (http://arxiv.org/abs/2605.22771v1)
PDF: https://arxiv.org/pdf/2605.22771v1
