1 │ Backdrop — Huang’s Warning on Capitol Hill

During an impromptu interview in Washington, NVIDIA CEO Jensen Huang told reporters that “China is right behind us — the gap is very, very small.” He added that sweeping export curbs on AI chips could hurt U.S. firms more than they slow Chinese progress. (Business Insider)


2 │ Hardware Race — Domestic Chips + Rack‑Scale Clusters

| System | Scale / Peak FP16* | Key Take‑aways | Why It Matters |
|---|---|---|---|
| Huawei CloudMatrix 384 | 384 × Ascend 910C, 300 PFLOPS (≈ 67 % above NVIDIA NVL72) | Optical “super‑node” interconnect, 3× NVL72 memory, but higher power draw and an $8.2 M price tag | First Chinese rack that can match a top U.S. GPU pod at system level (SemiAnalysis) |
| Baidu Kunlun P800 Cluster | 30 000 Kunlun P800 chips | Claimed to train 100 B‑plus‑parameter models or fine‑tune 1 000 tasks in parallel; already live in banks & cloud | Shows Chinese vendors can scale bespoke silicon to hyperscale size (Reuters) |

*FP16/BF16 mixed precision commonly used for large‑model training.
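A quick back‑of‑the‑envelope check of the ≈ 67 % figure in the table above. The NVL72 baseline of ~180 PFLOPS dense FP16/BF16 is an assumption drawn from commonly cited specs, not from this report:

```python
# Back-of-the-envelope: system-level vs per-chip comparison.
# Assumption: NVIDIA GB200 NVL72 ~ 180 PFLOPS dense FP16/BF16
# (widely cited spec, not stated in the table above).

cloudmatrix_pflops = 300   # Huawei CloudMatrix 384, FP16 (SemiAnalysis)
cloudmatrix_chips = 384    # Ascend 910C count
nvl72_pflops = 180         # assumed NVL72 baseline
nvl72_chips = 72           # Blackwell GPU count

system_ratio = cloudmatrix_pflops / nvl72_pflops - 1
print(f"System-level advantage: {system_ratio:.0%}")   # ~67%

per_chip_cm = cloudmatrix_pflops / cloudmatrix_chips
per_chip_nv = nvl72_pflops / nvl72_chips
print(f"Per chip: {per_chip_cm:.2f} vs {per_chip_nv:.2f} PFLOPS")
# ~0.78 vs 2.50 PFLOPS per chip: the rack wins by stacking roughly
# 5x more chips, which is also why its power draw is higher.
```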


3 │ Model Benchmarks — Open‑Source Releases Hitting the Leaderboards

| Model | Parameters | Latest Public Score |
|---|---|---|
| DeepSeek Janus‑Pro‑7B | 7 B | GenEval image‑generation 0.80, beating DALL‑E 3 (0.67) & SDXL (0.74) (Hugging Face; janusai.pro) |
| Alibaba Qwen 2.5 Omni | 3 B → 72 B series | Jumped to #1 on the Hugging Face open‑source leaderboard in early April 2025 (Hugging Face) |

Algorithmic tweaks and massive instruction datasets let Chinese models equal or surpass U.S. peers of the same size — and because the weights are open, anyone can verify this directly, as in the sketch below.
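A minimal sketch of querying one of these open‑weight models through the standard Hugging Face transformers chat API. The checkpoint ID Qwen/Qwen2.5-7B-Instruct is a published text‑only sibling of the Omni series, used here for simplicity; the prompt is illustrative:

```python
# Minimal sketch: chat with an open-weight Qwen 2.5 model via the
# standard transformers API (device_map="auto" requires `accelerate`).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # one checkpoint in the 3B-72B series
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Illustrative prompt; any chat-format message list works.
messages = [{"role": "user",
             "content": "Summarize the GenEval benchmark in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:],
                       skip_special_tokens=True))
```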


4 │ Market Flywheel — Hundred‑Million‑User Apps Feed More Data

In the January 2025 Global AI‑App MAU chart, China took two of the top five slots:

  • Doubao (ByteDance) — 78 M MAU
  • DeepSeek — 34 M MAU

That scale supplies fresh conversation data and third‑party plug‑in traffic, accelerating iterative improvement. (Backlinko)


5 │ Patent & R&D Density — “Quantity Has a Quality of Its Own”

WIPO’s 2024 landscape report shows 38 000 GenAI patent families from China (2014‑2023), six times the U.S. count. (WIPO)
Generous local subsidies mean more experiments move from paper to product.


6 │ Compute Infrastructure — The 300 EFLOPS National Target

China’s six‑ministry action plan aims for ≥ 300 EFLOPS of aggregate compute by 2025, up from 197 EFLOPS in 2023. (english.www.gov.cn)

What’s an EFLOPS? FLOPS are floating‑point operations per second.
Exa‑ (10¹⁸) means 1 EFLOPS = 1 quintillion (a million‑trillion) calculations every second — the “exascale” class occupied today by the U.S. supercomputer Frontier (≈ 1.1 EFLOPS). Hitting 300 EFLOPS would be like running roughly 270 Frontiers in parallel.
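The unit arithmetic behind those claims, using only the figures quoted above:

```python
# Unit arithmetic behind the 300 EFLOPS target.
FLOPS_PER_EFLOPS = 1e18      # exa- = 10^18

target_eflops = 300          # 2025 national target
baseline_eflops = 197        # 2023 installed base
frontier_eflops = 1.1        # U.S. Frontier supercomputer

growth = target_eflops / baseline_eflops - 1
print(f"Required growth over 2023: {growth:.0%}")        # ~52%

frontiers = target_eflops / frontier_eflops
print(f"Frontier equivalents: {frontiers:.0f}")          # ~273

total_ops = target_eflops * FLOPS_PER_EFLOPS
print(f"Ops per second at target: {total_ops:.1e}")      # 3.0e+20
```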

Generous land, power‑price and financing incentives are driving a wave of cool‑climate AI data‑center construction to reach that figure.


7 │ Cost Efficiency — DeepSeek‑V3 Breaks the “Billion‑Dollar Club”

DeepSeek reports training its 671 B‑parameter MoE model with 2.78 M H800 GPU‑hours (≈ 57 days on its roughly 2 048‑GPU cluster, US $5.6 M) — more than an order of magnitude below the rumored GPT‑4 bill. (Unite.AI)
Even if that figure excludes R&D trial runs, it shows that careful data mixing and sparse routing can slash capital needs.
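A quick sanity check of the reported numbers. The ~2 048‑GPU cluster size and the $2 per GPU‑hour rental rate are assumptions taken from the figures commonly cited alongside the DeepSeek‑V3 report:

```python
# Sanity-checking the reported DeepSeek-V3 training bill.
# Assumptions (commonly cited with the DeepSeek-V3 report):
# ~2,048 H800 GPUs and a ~$2/GPU-hour rental rate.
gpu_hours = 2.78e6     # reported H800 GPU-hours
cluster_gpus = 2048    # assumed cluster size
rate_usd = 2.0         # assumed $/GPU-hour

wall_clock_days = gpu_hours / cluster_gpus / 24
cost_musd = gpu_hours * rate_usd / 1e6
print(f"Wall-clock time: {wall_clock_days:.0f} days")   # ~57 days
print(f"Implied cost: ${cost_musd:.1f} M")              # ~$5.6 M
```

Both derived values match the figures quoted above, so the headline cost is internally consistent with the GPU‑hour count.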


8 │ Remaining Gaps

| Area | Current Bottleneck |
|---|---|
| Advanced Process | Ascend 910C is still 7 nm; energy efficiency lags 4 nm Blackwell‑class GPUs |
| Software Ecosystem | MindSpore / CANN maturity trails the CUDA stack; porting costs remain high |
| Talent Gravity | Top researchers still cluster in U.S. big‑tech labs; long‑term flow uncertain |

9 │ Overall Assessment

  • System‑level stacking & optical links let China field usable top‑tier compute despite chip sanctions.
  • Open‑source + community tactics propel benchmark parity and rapid app diffusion.
  • State‑backed infrastructure (300 EFLOPS) supplies the “power station” for continued scaling.

Bottom line: the contest has shifted from a “generation gap” to a neck‑and‑neck sprint. Future separation will hinge on access to sub‑4 nm manufacturing, energy efficiency breakthroughs, and how export‑control dynamics evolve. For global firms, risk‑balanced strategies that tap both ecosystems are fast becoming table stakes.