Research Paper
Researchia:202605.25050

It's the humans, not the data: Geopolitical bias in LLMs originates in post-training, amplified by the language of the prompt

Stuart Bladon

Submitted: May 25, 2026
Subjects: AI; Artificial Intelligence

Description / Details

It has generally been assumed that geopolitical bias in language models originates from the training data used during the pre-training phase. We tested seven open-weight LLM pairs, one from each of seven labs, each consisting of a base model (pre-training only) and its chat model (pre-training and post-training), on a paired-scenario forced-choice probe over 28 country pairs in English, French, and Chinese. We found that geopolitical bias originates in post-training rather than in pre-training. Models from six of the seven labs shifted, after post-training, in the direction associated with the country or region of the model developer. The shift is strongest in Alibaba's Qwen 2.5: the base model is neutral on China-favourability (-0.15 log-odds, p=0.15), whereas the post-trained chat variant is at +2.91 (p<10^-4), an 18x shift in odds. We also observe shifts in biases toward other countries across all models. Additionally, the magnitude of the shift depends on the language used to prompt the model: the French-made Mistral becomes pro-France only under French prompting (FR-EN shift +1.91, p<10^-4). These findings suggest that geopolitical preferences in language models are not simply inherited from large-scale internet data but are actively shaped during post-training, highlighting the need for greater transparency, auditing, and oversight of alignment processes that influence how models represent nations, cultures, and political perspectives.
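To make the log-odds figures easier to read, the sketch below converts them into odds multipliers and probabilities. It assumes the reported values are natural-log odds, which the abstract does not state explicitly, and the helper names `odds_multiplier` and `probability` are illustrative rather than taken from the paper. Under that assumption, the quoted 18x figure matches exp(2.91), that is, the chat model's odds relative to even odds; measuring from the base model's point estimate of -0.15 instead would give a somewhat larger multiplier. This reading is an inference from the numbers, not something the source confirms.

```python
import math


def odds_multiplier(log_odds_shift: float) -> float:
    """Turn a difference in natural-log odds into a multiplicative change in odds."""
    return math.exp(log_odds_shift)


def probability(log_odds: float) -> float:
    """Map log-odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Figures quoted in the abstract (assumed to be natural-log odds).
QWEN_BASE = -0.15      # base model, China-favourability
QWEN_CHAT = 2.91       # post-trained chat model
MISTRAL_FR_EN = 1.91   # Mistral, French minus English prompting

chat_vs_even = odds_multiplier(QWEN_CHAT)              # relative to even odds
chat_vs_base = odds_multiplier(QWEN_CHAT - QWEN_BASE)  # relative to the base estimate
mistral_shift = odds_multiplier(MISTRAL_FR_EN)

print(f"Qwen chat vs even odds:  {chat_vs_even:.1f}x")
print(f"Qwen chat vs base model: {chat_vs_base:.1f}x")
print(f"Mistral FR-EN shift:     {mistral_shift:.1f}x")
print(f"P(favourable), Qwen base: {probability(QWEN_BASE):.2f}")
print(f"P(favourable), Qwen chat: {probability(QWEN_CHAT):.2f}")
```

Reading the probabilities as the chance of a favourable choice on a single forced-choice item is itself an assumption about how the probe is scored; the abstract reports only the log-odds.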


Source: arXiv:2605.23825v1 - http://arxiv.org/abs/2605.23825v1
PDF: https://arxiv.org/pdf/2605.23825v1
