China’s National Data Administration has introduced a nationwide initiative aimed at enlarging the supply of validated, multimodal AI training data across key industries to maintain leadership in the global artificial intelligence race.

  • China targets expanded, validated AI datasets across major industries by 2028.
  • Emphasis on synthetic data generation and AI-assisted data labeling.
  • Plans to commercialize data through subscription, token pricing, and asset securitization.

What happened

China’s National Data Administration has published a detailed draft plan to significantly increase the availability of high-quality AI training data. The effort focuses on creating validated, industry-specific datasets covering sectors such as manufacturing, healthcare, finance, and agriculture, as well as emerging areas like autonomous driving and biomanufacturing.

The initiative promotes expansion into multimodal data types (text, images, video, audio, and code) to support advanced AI capabilities, including complex reasoning and autonomous robotic control. The administration plans to work with other government departments to build a robust dataset management system and to improve governance of data usage and processing.

Why it matters

Global AI development faces a critical challenge: research warns that the pool of human-generated text data could dry up between 2026 and 2032. China’s plan addresses this bottleneck by encouraging synthetic data generation techniques, such as the newly unveiled Kairos-HomeWorld framework, which simulates complex residential environments to train humanoid robots.

Additionally, the shift to automated, AI-assisted data annotation reduces reliance on manual labeling and speeds up the preparation of training data. These measures are intended to help China maintain a competitive edge in AI innovation at a time when quality data is becoming an increasingly scarce resource.

What to watch next

China’s ambition to turn raw data into financial instruments marks a notable development in data economy frameworks. The National Data Administration will explore subscription models, data marketplaces, token-based pricing, and financing mechanisms such as data trusts and asset securitization.

Observers should monitor how these commercial experiments unfold and how they affect domestic and international AI ecosystems. Progress on governance reforms and effective collaboration across ministries will also be key to China’s goal of embedding AI deeply into its economy by 2028.

Source assisted: This briefing began from a discovered source item from SCMP China Tech.