What we found

The five headline results, in plain language. Every claim links to the page that shows the full data and the raw transcripts behind it.

Each side guards its own politics

14%: Chinese models refuse China-sensitive questions

6%: US models, same questions

16%: US models refuse Western culture-war questions

7%: Chinese models, same questions

The headline pattern is a mirror: Chinese models go quiet on topics sensitive to Beijing, US models go quiet on Western culture-war topics, and each refuses the other side's hot-button questions less than half as often. Neither is simply 'more censored'; they censor different things.

Switch to Chinese, and the refusals appear

deepseek-r1-14b · in English: 29%

deepseek-r1-14b · in Mandarin: 81%

+52 percentage points: more refusals on the same questions, just asked in Chinese

The identical China-sensitive prompt, asked in Mandarin instead of English, is far more likely to be refused — deepseek-r1-14b jumps from 29% to 81%. This "self-censors in its own language" effect is detailed on the Mandarin Effect page.
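
The +52 figure is a difference in percentage points, not a relative increase. A minimal sketch of that arithmetic in Python, assuming each response has already been labelled as refused or not. The function names and the 100-prompt sample are illustrative assumptions chosen to reproduce the reported rates; they are not the study's actual data or code.

```python
def refusal_rate(refused):
    """Percentage of responses labelled as refusals (True = refused)."""
    if not refused:
        raise ValueError("need at least one labelled response")
    return 100.0 * sum(refused) / len(refused)


def language_gap(english, mandarin):
    """Mandarin refusal rate minus English refusal rate, in percentage points."""
    return refusal_rate(mandarin) - refusal_rate(english)


# Hypothetical labels sized to match the reported 29% and 81% rates.
english_labels = [True] * 29 + [False] * 71
mandarin_labels = [True] * 81 + [False] * 19

gap = language_gap(english_labels, mandarin_labels)
print(f"{gap:+.0f} points")  # +52 points
```

Because the same prompts are asked in both languages, the gap isolates the effect of the language rather than of the question mix.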

“Chinese model” is not one behaviour

deepseek-r1-14b: 55%

qwen3-8b: 7%

Both are Chinese open-weight models, yet on China-sensitive prompts deepseek-r1-14b refuses 55% of the time and Alibaba's qwen3-8b only 7%. Censorship is a property of the specific checkpoint and its training, not of the lab's home country.

Two AIs from opposite sides of the world agree the refusals are real

κ = 0.877: agreement between a US judge (Google gemma3) and a Chinese judge (Alibaba qwen3) on whether a model refused

To rule out the obvious objection, that a Western referee is rigging the call, every answer is scored by two independent judges, one US and one Chinese. They agree on whether a model refused at Cohen's κ = 0.877, in the “almost perfect” band (0.81 to 1.00) of the commonly used Landis and Koch scale. The refusal numbers therefore do not rest on a single judge’s opinion. See inter-judge agreement.
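
Cohen's κ corrects raw agreement for the agreement two judges would reach by chance given how often each says "refused". A minimal sketch of the standard formula, κ = (p_o − p_e) / (1 − p_e), in plain Python. The function name and the ten toy labels are invented for illustration; they do not reproduce the 0.877 figure and are not the study's scoring code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement: share of items where the two judges match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each judge's marginal rate, per label.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)

    if expected == 1.0:
        # Both judges used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical verdicts (True = "model refused") from two judges on ten answers.
judge_us = [True, True, False, False, True, False, False, False, True, False]
judge_cn = [True, True, False, False, False, False, False, False, True, False]

kappa = cohens_kappa(judge_us, judge_cn)
print(f"{kappa:.3f}")  # 0.783
```

In this toy case the judges match on 9 of 10 answers (90% raw agreement), yet κ is lower because some of that agreement is expected by chance alone.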

Models warm to some groups and stay cool to others

qwen3-30b-a3b: treats matched groups most unequally, with the widest gap on the “Is more diversity / more homogeneity good?” set

When the same question is asked about different groups (racial identity, criticism, jokes), models often affirm some warmly and answer others flatly or with a caveat. The Symmetry page quantifies this per model as a Differential-Treatment Index; the gaps are real but smaller and subtler than the China-axis refusals.

Want the rigour? The Abstract states the formal result, the Methodology explains how everything is measured and scored, and the Results explorer exposes all 2288 raw exchanges.