The Mandarin Effect

Every prompt is asked twice — once in English, once in Mandarin — and nothing else changes. The gap between the two is the cleanest measure of a model censoring in its own language.

Refusal rate on China-sensitive prompts: English vs Mandarin

For each model, the two values show how often it refuses or deflects the China-sensitive questions in each language. A model that treats the two languages alike has equal values; a larger gap indicates language-conditioned self-censorship. Sorted by the size of that gap (zh−en: Mandarin rate minus English rate, in percentage points).

Origin	Model	English	Mandarin	Δ (zh−en)
China	deepseek-r1-14b	29%	81%	+52%
United States	gptoss-20b	0%	25%	+25%
United States	llama31-8b	0%	19%	+19%
United States	grok-4.3-reasoning	0%	14%	+14%
China	glm4-9b	0%	10%	+10%
China	qwen3-30b-a3b	0%	10%	+10%
China	qwen3-8b	5%	10%	+5%
China	yi-9b	0%	0%	+0%
United States	claude-sonnet-4-6	0%	0%	+0%
United States	grok-4.3	0%	0%	+0%
United States	phi4-14b	5%	5%	+0%
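The zh−en gap reported for each model above is plain arithmetic: the Mandarin refusal rate minus the English one, with models then ordered from largest gap to smallest. A minimal sketch of that calculation follows, using the per-model rates listed above. The names `rates` and `language_gap` are illustrative and not taken from the page's own tooling.

```python
# Refusal rates in percent, copied from the per-model figures above:
# model -> (English, Mandarin).
rates = {
    "deepseek-r1-14b": (29, 81),
    "gptoss-20b": (0, 25),
    "llama31-8b": (0, 19),
    "grok-4.3-reasoning": (0, 14),
    "glm4-9b": (0, 10),
    "qwen3-30b-a3b": (0, 10),
    "qwen3-8b": (5, 10),
    "yi-9b": (0, 0),
    "claude-sonnet-4-6": (0, 0),
    "grok-4.3": (0, 0),
    "phi4-14b": (5, 5),
}


def language_gap(en: int, zh: int) -> int:
    """Mandarin minus English refusal rate, in percentage points."""
    return zh - en


# Largest gap first. Python's sort is stable (also with reverse=True),
# so models with equal gaps keep the order in which they are listed.
ranked = sorted(rates.items(), key=lambda kv: language_gap(*kv[1]), reverse=True)

for model, (en, zh) in ranked:
    print(f"{model:20s} en={en:3d}%  zh={zh:3d}%  gap={language_gap(en, zh):+d}")
```

Note that a gap of zero does not mean zero refusals: phi4-14b refuses 5% of prompts in both languages, so it shows no gap even though it does sometimes refuse.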

Inside the biggest gap: deepseek-r1-14b, topic by topic

Broken out by sub-topic, the pattern is stark. On five of the thirteen sub-topics (covid, governance, South China Sea, Taiwan, Xinjiang), deepseek-r1-14b answers every English prompt and refuses every Mandarin one: a clean flip from 0% to 100%. Only the most acute topics (Tiananmen, named dissidents) are refused outright in both languages, and only one (surveillance) is answered in both.

Sub-topic	English	Mandarin	Δ (zh−en)	Note
COVID	0%	100%	+100%	answers in English, refuses in Mandarin
Governance	0%	100%	+100%	answers in English, refuses in Mandarin
South China Sea	0%	100%	+100%	answers in English, refuses in Mandarin
Taiwan	0%	100%	+100%	answers in English, refuses in Mandarin
Xinjiang	0%	100%	+100%	answers in English, refuses in Mandarin
Falun Gong	33%	100%	+67%
Hong Kong	0%	50%	+50%
Tibet	0%	50%	+50%
Xi Jinping	0%	50%	+50%
Censorship	67%	100%	+33%
Dissidents	100%	100%	+0%
Surveillance	0%	0%	+0%
Tiananmen	100%	100%	+0%

Per-sub-topic counts are small (1–3 prompts each), so read the rows as the shape of the effect; the per-model totals above are the robust numbers. Want the actual words? The Receipts page shows these exchanges verbatim.
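
The caveat about small counts can be made concrete. With only 1–3 prompts per sub-topic, a refusal rate can take just a handful of values, and a single changed answer moves it by a third or more. The sketch below assumes each prompt is scored as a binary outcome (refused/deflected or answered), which the page does not state explicitly; `possible_rates` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction


def possible_rates(n_prompts: int) -> list[int]:
    """Every refusal rate (rounded percent) reachable with n_prompts prompts.

    Assumes binary scoring: each prompt is either refused or answered.
    """
    return [round(100 * Fraction(k, n_prompts)) for k in range(n_prompts + 1)]


for n in (1, 2, 3):
    # One prompt changing outcome shifts the rate by 100/n points.
    step = round(100 / n)
    print(f"{n} prompt(s): rates {possible_rates(n)}, one answer = ~{step} points")
```

This matches the values that appear in the table: 33% and 67% can only come from three-prompt sub-topics and 50% only from two-prompt ones, which is why individual rows are best read as direction rather than precise magnitude.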