A reproducible comparison of political bias & refusal in US and Chinese language models

What we found

1

Each side guards its own politics

31%: Chinese models refuse China-sensitive questions
16%: US models, same questions
26%: US models refuse Western culture-war questions
11%: Chinese models, same questions

The headline pattern is a mirror: Chinese models go quiet on topics sensitive to Beijing, US models go quiet on Western culture-war topics, and each refuses the other side's hot-button questions far less often (16% and 11%, against 31% and 26% on home turf). Neither is simply 'more censored'; they censor different things.
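For readers who want to reproduce this kind of breakdown, a minimal sketch of the tally follows. It assumes each exchange has already been reduced to a (model origin, topic, refused) triple; the record layout, the function name `refusal_rates`, and the toy rows are illustrative assumptions, not the project's actual data or pipeline.

```python
from collections import defaultdict

def refusal_rates(records):
    """Return the refusal rate for each (origin, topic) pair.

    `records` is an iterable of (origin, topic, was_refused) triples.
    This layout is a hypothetical one chosen for illustration.
    """
    totals = defaultdict(int)
    refused = defaultdict(int)
    for origin, topic, was_refused in records:
        key = (origin, topic)
        totals[key] += 1
        refused[key] += int(was_refused)
    return {key: refused[key] / totals[key] for key in totals}

# Toy rows, invented for illustration only (not results from the study).
toy_records = [
    ("CN", "china_sensitive", True),
    ("CN", "china_sensitive", False),
    ("US", "china_sensitive", False),
    ("US", "china_sensitive", False),
    ("US", "culture_war", True),
    ("US", "culture_war", False),
    ("CN", "culture_war", False),
    ("CN", "culture_war", False),
]

rates = refusal_rates(toy_records)
for (origin, topic), rate in sorted(rates.items()):
    print(f"{origin} models on {topic}: {rate:.0%}")
```

Keeping origin and topic as a joint key is what makes the mirror visible: a single pooled refusal rate per origin would average the two topic sets together and hide it.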

2

Switch to Chinese, and the refusals appear

deepseek-r1-14b · in English: 29%
deepseek-r1-14b · in Mandarin: 81%

+52 percentage points in refusals: same questions, just asked in Mandarin

The identical China-sensitive prompt, asked in Mandarin instead of English, is far more likely to be refused — deepseek-r1-14b jumps from 29% to 81%. This "self-censors in its own language" effect is detailed on the Mandarin Effect page.

3

“Chinese model” is not one behaviour

deepseek-r1-14b: 55%
qwen3-8b: 7%

Both are Chinese open-weight models, yet on China-sensitive prompts deepseek-r1-14b refuses 55% of the time and Alibaba's qwen3-8b only 7%. Censorship is a property of the specific checkpoint and its training, not the flag on the lab.

4

Two AIs from opposite sides of the world agree the refusals are real

κ = 0.924: agreement between a US judge (Google gemma3) and a Chinese judge (Alibaba qwen3) on whether a model refused

To rule out the obvious objection, that a Western referee is rigging the call, every answer is scored by two independent judges, one US and one Chinese. They agree on whether a model refused at Cohen's κ = 0.924 (“almost perfect” on the conventional Landis and Koch scale). The refusal rates therefore do not rest on a single judge's opinion. See inter-judge agreement.
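Cohen's κ corrects raw agreement for the agreement two judges would reach by chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected from each judge's label frequencies. A minimal sketch of that calculation is below; the function name, the boolean refused/answered encoding, and the toy verdicts are assumptions made for illustration, not the project's scoring code or data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement: share of items where the two raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)

# Toy verdicts (True = "refused"), invented for illustration only.
judge_us = [True, True, False, False, True, False, False, False, True, False]
judge_cn = [True, True, False, False, True, False, False, True, True, False]

kappa = cohens_kappa(judge_us, judge_cn)
print(f"kappa = {kappa:.3f}")
```

In the toy data the judges agree on 9 of 10 items (p_o = 0.9) while chance alone would give p_e = 0.5, so κ comes out at 0.8. That is lower than the raw 90% agreement, which is the point of the correction.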

5

Models warm to some groups and stay cool to others

llama31-8b: treats matched groups most unequally, with the widest gap on the “Is racism against ___ a serious problem?” set

When the same question is asked about different groups (pride, criticism, jokes), models often affirm some warmly and answer others flatly or with a caveat. The Symmetry page quantifies this per model as a Differential-Treatment Index; the gaps are real but smaller and subtler than the China-axis refusals.

Want the rigour? The Abstract states the formal result, the Methodology explains how everything is measured and scored, and the Results explorer exposes all 1040 raw exchanges.