- Hugging Face launched Open-R1 to replicate DeepSeek's R1 in January 2025.
- What followed reshaped open-source AI globally. Here is the full 16-month story.
Hugging Face Launched Open-R1 to Fight a Black Box. What Happened Next Changed Everything.
The week of January 20, 2025 was one of the more disorienting in recent AI history. DeepSeek launched its chatbot based on the R1 reasoning model, and within a week it had overtaken ChatGPT as the most downloaded free app on the iOS App Store in the United States. Nvidia's share price fell 18% in a single session, the largest one-day loss of market value by any company in US stock market history.
The shock was not just about the model's capability. It was about what the model revealed: that a Chinese lab funded by a quantitative hedge fund, working under US chip export restrictions and with a fraction of the compute budget of OpenAI or Anthropic, had produced a reasoning model competitive with the best Western systems. DeepSeek claimed it trained its V3 model for $6 million — far less than OpenAI's reported $100 million for GPT-4 — using approximately one-tenth the computing power of Meta's comparable model.
Hugging Face's response, announced within days, was both practical and political: replicate R1 from scratch, publish everything, and prove that open source could answer a Chinese model whose weights were public but whose training data and code were not with a fully open Western one.
What Open-R1 Actually Accomplished
The project moved in three planned stages. Step 1 was to replicate R1's distillation approach — training smaller models on R1's outputs to match its reasoning capabilities. Step 2 was to replicate the pure reinforcement learning pipeline DeepSeek used to create R1-Zero. Step 3 was to demonstrate a full training pipeline from base model to RL-tuned reasoning model.
Step 1 was completed in May 2025 with the release of Mixture-of-Thoughts — a curated reasoning dataset of 350,000 verified traces distilled from R1, spanning mathematics, coding, and science. The team also provided a recipe to train OpenR1-Distill-7B, which replicates the reasoning capabilities of DeepSeek's own distilled 7B model. Earlier milestones included a CodeForces dataset of 10,000 competitive programming problems with 100,000 solutions, with models trained on this data matching DeepSeek's distilled performance.
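The distillation step described above comes down to collecting a stronger model's reasoning traces, keeping only those whose final answer can be checked, and using the survivors as supervised training data for a smaller model. A minimal sketch of that filtering idea follows. The `Trace` type, the `Answer:` convention, and the sample data are hypothetical illustrations, not Open-R1's actual pipeline or data format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trace:
    """One teacher-generated reasoning trace (hypothetical structure)."""
    problem_id: str
    text: str  # full reasoning, expected to end with a line like "Answer: 42"


def extract_answer(text: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent."""
    marker = "Answer:"
    idx = text.rfind(marker)
    if idx == -1:
        return None
    return text[idx + len(marker):].strip()


def keep_verified(traces: List[Trace], references: Dict[str, str]) -> List[Trace]:
    """Keep only traces whose final answer equals the known reference answer."""
    verified = []
    for trace in traces:
        answer = extract_answer(trace.text)
        if answer is not None and answer == references.get(trace.problem_id):
            verified.append(trace)
    return verified


# Made-up sample data, for illustration only.
references = {"p1": "4", "p2": "9"}
traces = [
    Trace("p1", "2 + 2 is 4.\nAnswer: 4"),
    Trace("p1", "2 + 2 is 5.\nAnswer: 5"),  # wrong answer, dropped
    Trace("p2", "3 squared is 9."),          # no final answer, dropped
    Trace("p2", "3 * 3 = 9.\nAnswer: 9"),
]

# The surviving traces would become fine-tuning examples for the smaller model.
sft_data = keep_verified(traces, references)
print([t.problem_id for t in sft_data])
```

In a real pipeline the check would more likely be a math verifier or a code test harness than string equality; exact matching is used here only to keep the sketch short.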
The project succeeded in its transparency goal — every dataset, experiment detail, and intermediate model was published, precisely the openness that DeepSeek's release had withheld. Whether it fully matched R1's capabilities at the base model level is a more complex question; Steps 2 and 3 of the replication roadmap remain works in progress.
DeepSeek's Evolution Didn't Stop
While Hugging Face was working on the replication, DeepSeek kept shipping. The company released R1-0528 in May 2025, V3.1 in August 2025, V3.1-Terminus in September, and V3.2 in December 2025. V3.1 surpassed prior models by over 40% on SWE-bench and Terminal-bench. Each release extended the lead that Open-R1 was attempting to close, while remaining available under the MIT licence.
The political dimension of DeepSeek's releases deepened over the year. The May 2025 release of DeepSeek-R1-0528 was noted for adhering more closely to Chinese Communist Party ideology and censorship requirements than prior models. This created a specific tension for developers in the Global South who found the models technically compelling: near-frontier AI capability at zero cost, but with embedded political constraints on what the model would say.
The Chinese Open-Source Ecosystem That DeepSeek Catalysed
The more consequential impact of R1 was not DeepSeek itself but what it unlocked in the broader Chinese AI ecosystem. Before R1, China's AI industry was still largely centred on closed models. After R1, the number of open releases from Chinese companies substantially increased — Baidu went from zero releases on Hugging Face in 2024 to over 100 in 2025.
Alibaba's Qwen family was the primary beneficiary. By September 2024, before the DeepSeek moment, Alibaba was already reporting over 600 million global downloads of Qwen models. After DeepSeek's success, adoption accelerated dramatically. By mid-2025, Qwen had become the model with the most derivatives on Hugging Face, with over 113,000 models using Qwen as a base and over 200,000 model repositories tagging Qwen — far exceeding Meta's Llama at 27,000 or DeepSeek at 6,000.
By August 2025, models derived from Qwen accounted for more than 40% of new language-model derivatives on Hugging Face, while Llama's share had fallen to approximately 15%. The default base model for the global open-source AI community had shifted from American to Chinese.
The aggregate effect was measurable. A study by researchers at MIT and Hugging Face found that Chinese open-weight models accounted for 17.1% of global AI model downloads over the year ending in August 2025 — narrowly surpassing the US share of 15.86%. It was the first time China had led in this metric.
The Geopolitical Fault Line
Despite pushback from the West, much of the Global South embraced Chinese models, seeing open-source as a path to AI sovereignty. Singapore's government-backed AI Singapore program chose Alibaba's Qwen over Meta's Llama to build its latest regional model; Malaysia announced that its sovereign AI ecosystem would run on DeepSeek.
For Indian developers and policymakers, this creates a specific decision point. Chinese open-source models offer near-frontier capability at zero cost — relevant for startups that cannot afford proprietary API pricing.
The tradeoffs are threefold: content restrictions baked into the models' training; US government concern about dependency on Chinese AI infrastructure; and the February 2026 controversy in which Anthropic accused several Chinese labs of using fraudulent accounts to generate millions of conversations with Claude to train their own models. Using the output of such models in commercial products carries legal and reputational uncertainty.
India's own open-source AI initiatives — Sarvam, Krutrim, and the government-backed IndiaAI programme — are building on Western base models precisely to avoid this dependency, while attempting to capture the cost advantage of open-source.
Bottom Line
The Hugging Face analysis of the DeepSeek moment concluded that 2025 was less about a single breakthrough model and more about a phase shift in open-source behaviour — from isolated releases to high-cadence, ecosystem-level production.
Open-R1 was the Western open-source community's answer to a question DeepSeek posed: can transparency compete with capability? The answer so far is partial. The datasets and training recipes are public, the full replication is still in progress, and DeepSeek has continued shipping. What is unambiguous is that the open-source AI landscape of January 2026 looks nothing like that of January 2025, and Hugging Face's decision to publish everything was central to how the Western developer community processed and responded to the shift.
Edited by Nabarun.