The Delivery Giant's Gambit: How Meituan's LongCat-2.0 Proved China Can Train Frontier AI Without Nvidia
Meituan, better known for ferrying dumplings across Chinese cities, has open-sourced LongCat-2.0, a 1.6-trillion-parameter coding model trained entirely on domestic Chinese chips. Under a pseudonym, the model quietly topped OpenRouter's agent leaderboards for two months. The release is the most concrete evidence yet that U.S. export controls have not foreclosed China's path to frontier-scale AI.
Elena Vance, Frontier Correspondent · Jul 4, 2026 · 11m read
For the better part of two months, a model called "Owl Alpha" quietly climbed the rankings on OpenRouter, the popular model-routing platform used by developers worldwide. It ranked first on the Hermes Agent leaderboard, second on Claude Code, and third in OpenClaw deployments. It processed over 10 trillion tokens monthly, growing at 200% month-on-month. Nobody knew who built it.
The reveal, when it came in late June 2026, was not from OpenAI, Anthropic, or Google DeepMind. It was from Meituan, China's dominant food delivery and local services conglomerate, whose primary business involves dispatching couriers on electric scooters to deliver hot pot and bubble tea across Chinese cities. The model was LongCat-2.0, a 1.6-trillion-parameter Mixture-of-Experts system trained entirely on domestic Chinese hardware, now open-sourced under the MIT licence.
The geopolitical implications arrived before the technical ones. Washington has spent three years and considerable diplomatic capital constructing an export-control regime designed to prevent China from acquiring the compute necessary to train frontier-scale AI. The South China Morning Post called it "China's biggest AI model trained on local chips." The more precise framing is this: LongCat-2.0 is the most concrete, independently verifiable evidence yet that the export-control strategy has not foreclosed China's path to frontier-scale AI. It has merely redirected it.
The Architecture of Circumvention
The technical story of LongCat-2.0 begins not with the model itself but with the infrastructure that produced it. Meituan trained the model on a cluster of more than 50,000 domestic Chinese AI ASICs, carrying out both the full pre-training run, which consumed over 35 trillion tokens, and the inference pipeline without a single Nvidia GPU. While Meituan has not officially named its chip supplier, industry analysts and reporting from Geopolitechs and SiliconAngle point toward Huawei's Atlas-950 SuperPods and the Huawei Collective Communication Library as the likely substrate.
This is not the first time a Chinese lab has demonstrated frontier-scale training on domestic hardware. Zhipu AI's GLM-5, a 744-billion-parameter MoE model released in February 2026, was trained on a cluster of 100,000 Huawei Ascend 910B chips. But LongCat-2.0 is larger, more capable on the benchmarks that matter to developers, and — crucially — open-sourced. The combination of scale, performance, and permissive licensing makes it a different kind of signal.
How the Architecture Works
LongCat-2.0 is a Mixture-of-Experts model with 1.6 trillion total parameters, but the figure that matters for inference cost is the dynamic activation range: between 33 billion and 56 billion parameters per token, depending on query complexity. The model also incorporates N-gram Embedding, adding 135 billion parameters specifically to capture local token relationships — a technique that improves performance on code and structured text without proportionally increasing inference cost.
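To put the activation range in perspective, the share of the model that is active for any one token follows directly from the figures above. The short calculation below is only an illustration: the function name `active_fraction` is hypothetical, and the arithmetic treats the 1.6 trillion figure as the full parameter count.

```python
TOTAL_PARAMS = 1.6e12          # total parameters, as reported
ACTIVE_RANGE = (33e9, 56e9)    # parameters activated per token, as reported


def active_fraction(active_params: float, total_params: float = TOTAL_PARAMS) -> float:
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


low, high = (active_fraction(a) for a in ACTIVE_RANGE)
print(f"Active per token: {low:.2%} to {high:.2%} of total parameters")
# prints: Active per token: 2.06% to 3.50% of total parameters
```

Roughly 2 to 3.5 per cent of the weights do the work for any given token, which is why per-token inference cost tracks the active count rather than the 1.6 trillion headline.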
The context window is a native 1 million tokens, supported by a proprietary mechanism called LongCat Sparse Attention (LSA). LSA achieves linear-complexity attention through three interlocking techniques:
- Streaming-aware Indexing, which optimises High Bandwidth Memory utilisation during long-context inference
- Cross-Layer Indexing, which reuses saliency patterns across transformer layers to reduce redundant computation
- Hierarchical Indexing, a two-stage token selection process that identifies the most relevant tokens before full attention is computed
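Meituan has not published LSA at the level of code, so the following is only a toy sketch of the general idea behind the two-stage selection in the last bullet: score coarse blocks of the context first, then score individual tokens only inside the best blocks. Every name here (`two_stage_select`, `block_size`, `top_blocks`, `top_tokens`), the dot-product score, and the mean-pooled block summaries are assumptions made for illustration, not details of LongCat's implementation.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, used here as a stand-in relevance score."""
    return sum(x * y for x, y in zip(a, b))


def two_stage_select(query: Sequence[float], keys: List[Sequence[float]],
                     block_size: int = 4, top_blocks: int = 2,
                     top_tokens: int = 3) -> List[int]:
    """Return sorted indices of the tokens kept for full attention.

    Stage 1 ranks fixed-size blocks by the score of their mean key.
    Stage 2 ranks individual tokens, but only inside the winning blocks.
    """
    # Stage 1: coarse pass over block summaries (hypothetical mean pooling).
    # A real system would cache these summaries instead of recomputing them.
    blocks = [range(start, min(start + block_size, len(keys)))
              for start in range(0, len(keys), block_size)]

    def block_score(block: range) -> float:
        mean_key = [sum(keys[i][d] for i in block) / len(block)
                    for d in range(len(query))]
        return dot(query, mean_key)

    best_blocks = sorted(blocks, key=block_score, reverse=True)[:top_blocks]

    # Stage 2: fine pass over tokens in the surviving blocks only.
    candidates = [i for block in best_blocks for i in block]
    best = sorted(candidates, key=lambda i: dot(query, keys[i]),
                  reverse=True)[:top_tokens]
    return sorted(best)


query = [1.0, 0.0]
keys = [
    [0.1, 0.0], [0.2, 0.0], [0.0, 1.0], [0.1, 0.5],    # block 0: weak match
    [0.9, 0.0], [0.8, 0.1], [0.7, 0.0], [0.05, 0.0],   # block 1: strong match
    [0.3, 0.0], [0.95, 0.0], [0.1, 0.0], [0.2, 0.0],   # block 2: one strong token
]
selected = two_stage_select(query, keys)
print(selected)  # prints [4, 5, 9]
```

In this toy the fine pass scores only the eight tokens in the two surviving blocks rather than all twelve, so its cost is bounded by `top_blocks * block_size` however long the context grows.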
The model's training objective was shaped by a strategy Meituan calls Multi-Teacher Optimization via Mixture of Specialized Experts (MOPD), which isolates and optimises three distinct functional domains: Agent Experts (tool invocation and multi-step execution), Reasoning Experts (STEM problem-solving and formal logic), and Interaction Experts (alignment and hallucination suppression). The result is a model that does not merely perform well on aggregate benchmarks but is architecturally oriented toward the specific demands of agentic coding workflows.
The Benchmarks — and Why to Read Them Carefully
According to VentureBeat's reporting and the official LongCat documentation, the model's headline numbers are:
- SWE-bench Pro: 59.5 (versus GPT-5.5's 58.6 and Gemini 3.1 Pro's reported score)
- Terminal-Bench 2.1: 70.8
- FORTE (Multilingual): 77.3
The SWE-bench Pro figure is the one that will attract the most attention, and the most scrutiny. SWE-bench measures a model's ability to resolve real GitHub issues, a task that requires understanding large codebases, identifying the relevant files, writing a patch, and passing the associated test suite. A score of 59.5 is genuinely competitive with the current frontier. But benchmark results published by the model's own developer, without independent replication, warrant the usual caution. The more compelling evidence of LongCat-2.0's real-world utility is the two months it spent as "Owl Alpha" on OpenRouter, where it was selected by developers building production agents without any knowledge of its provenance.
"The stealth deployment was not a marketing stunt — it was an engineering validation," noted one analysis from Geopolitechs. "By the time Meituan revealed the model's identity, it had already proven its stability and utility on domestic infrastructure under real production load."
That validation matters. A model that tops leaderboards in a lab environment but fails under the latency and throughput demands of production agentic workflows is of limited commercial value. LongCat-2.0 demonstrated the opposite: it was already the backbone of agents like Hermes Agent and a top-ranked model on Claude Code before anyone knew Meituan had built it.
The Pricing Signal
Meituan has priced LongCat-2.0 aggressively. Standard API access is $0.75 per million input tokens and $2.95 per million output tokens, with promotional rates as low as $0.30 and $1.20 respectively. The company has also introduced a zero-charge policy for context cache hits — a feature specifically designed to reduce costs for iterative agentic workflows where the same long context is repeatedly queried with small modifications.
For comparison, Anthropic's Claude Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens. OpenAI's GPT-5.5 sits at a similar tier. LongCat-2.0's pricing is not merely competitive — it is a deliberate attempt to undercut the Western frontier on cost while matching it on capability. For enterprises running high-volume agentic pipelines, the economics are difficult to ignore.
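The pricing gap can be made concrete with a small calculation. The per-million-token rates below are the ones quoted above; the workload (the token counts and the cache-hit share) is a made-up example, and the function name `job_cost` is hypothetical. The sketch applies the zero-charge cache policy by simply not billing cached input tokens, which is an assumption about how the discount is metered.

```python
def job_cost(input_tokens: int, output_tokens: int,
             in_price: float, out_price: float,
             cached_share: float = 0.0) -> float:
    """Cost in US dollars; prices are per million tokens.

    cached_share is the fraction of input tokens served from the context
    cache and billed at zero (assumed metering, not a documented rule).
    """
    billable_input = input_tokens * (1.0 - cached_share)
    return (billable_input * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical agentic workload: 500M input tokens, 50M output tokens.
workload = dict(input_tokens=500_000_000, output_tokens=50_000_000)

longcat = job_cost(**workload, in_price=0.75, out_price=2.95)
longcat_cached = job_cost(**workload, in_price=0.75, out_price=2.95,
                          cached_share=0.6)
sonnet = job_cost(**workload, in_price=2.00, out_price=10.00)

print(f"LongCat-2.0, standard rates:   ${longcat:,.2f}")
print(f"LongCat-2.0, 60% cache hits:   ${longcat_cached:,.2f}")
print(f"Claude Sonnet 5, list prices:  ${sonnet:,.2f}")
print(f"List-price ratio: {sonnet / longcat:.1f}x")
```

On this example workload the list-price comparison comes to $522.50 against $1,500.00, a ratio of about 2.9. The cached row is shown only for LongCat-2.0 and should not be compared with the Sonnet figure, which ignores any caching discount on that side.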
Meituan's Unlikely Trajectory
The company behind LongCat-2.0 is not a frontier AI lab. Meituan is China's dominant local services platform — the company that coordinates millions of delivery orders daily, optimises courier routing in real time, and has built one of the world's most sophisticated logistics operations. Its pivot toward AI is not incidental to that business; it is an extension of it.
The company acquired the AI startup Light Year Beyond for $281 million in 2023, seeding its internal research capability. By 2025, it had committed over 10 billion yuan annually to AI development as part of a broader 21.1 billion yuan R&D budget — a figure that would be remarkable for a dedicated AI lab, let alone a food delivery company. The LongCat model family began with LongCat-Flash in August 2025, a 560-billion-parameter model for general-purpose tasks, followed by LongCat-Flash-Thinking in September 2025, a reasoning-specialised variant. LongCat-2.0 is the third generation, and the first to operate at a scale that places it in direct competition with the Western frontier.
The strategic logic is clear: Meituan's core business depends on AI for logistics optimisation, demand forecasting, and customer service at scale. Building proprietary frontier models is not a vanity project — it is vertical integration of the most critical input to its operations.
The open-source release under the MIT licence is a separate strategic calculation. By committing to release model weights freely, Meituan builds developer goodwill, accelerates ecosystem adoption, and positions itself as a credible infrastructure provider in the Chinese AI market, a role that has been largely occupied by Alibaba (via Qwen) and Baidu (via ERNIE). The MIT licence is notably more permissive than the licences attached to many Western open-weight releases, which typically include commercial use restrictions.
The Export Control Question
The release of LongCat-2.0 arrives at a moment of intense debate about the efficacy of U.S. semiconductor export controls. The Biden administration's October 2022 and October 2023 chip rules, extended and tightened by the Trump administration in 2025, were designed to prevent Chinese entities from acquiring the compute necessary to train frontier-scale models. The underlying assumption was that without access to Nvidia's H100 and H200 GPUs — and their successors — Chinese labs would be unable to close the capability gap with their Western counterparts.
LongCat-2.0 does not definitively refute that assumption, but it complicates it significantly. The model was trained on domestic ASICs that, on a per-chip basis, offer substantially lower performance than Nvidia's restricted hardware. The Huawei Ascend 910B, for instance, delivers approximately 320 TFLOPS of FP16 performance — roughly comparable to an Nvidia A100, a chip that predates the most stringent export controls. The gap to Nvidia's current Blackwell generation is substantial.
What Meituan and other Chinese labs have demonstrated is that the gap in per-chip performance can be partially compensated by:
- Scale: deploying clusters of 50,000 to 100,000 domestic chips rather than the smaller, denser clusters that Nvidia hardware enables
- Architectural efficiency: MoE designs that activate only a fraction of total parameters per token, reducing the effective compute requirement per inference
- Software optimisation: custom communication libraries and training frameworks (such as Huawei's MindSpore and CANN toolchain) that extract maximum utilisation from available hardware
- Data quality: investing in curated, high-quality training corpora rather than relying on raw compute to compensate for data noise
The result is not parity with the Western frontier — GPT-5.6 Sol and Claude Fable 5 remain ahead on the most demanding reasoning benchmarks. But it is close enough to be commercially viable, and the trajectory is improving faster than the export-control regime can adapt.
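The first two levers, scale and sparse activation, lend themselves to a back-of-envelope estimate. The sketch below uses the common rule of thumb that training costs about 6 FLOPs per active parameter per token, together with the token count, cluster size and activation range reported above. Two inputs are assumptions rather than reported facts: the per-chip peak is the Ascend 910B figure quoted earlier (Meituan has not named its chip), and the 30 per cent utilisation is a placeholder. The function names are hypothetical, and the result is an order-of-magnitude illustration, not a reconstruction of Meituan's actual run.

```python
def training_flops(active_params: float, tokens: float) -> float:
    """Rule-of-thumb training compute: ~6 FLOPs per active parameter per token."""
    return 6.0 * active_params * tokens


def training_days(flops: float, chips: int, peak_flops_per_chip: float,
                  utilisation: float) -> float:
    """Wall-clock days if the cluster sustains `utilisation` of peak throughput."""
    sustained = chips * peak_flops_per_chip * utilisation
    return flops / sustained / 86_400


TOKENS = 35e12        # pre-training tokens, as reported
CHIPS = 50_000        # cluster size, as reported
PEAK = 320e12         # FP16 FLOP/s per chip: Ascend 910B figure (assumed to apply)
UTILISATION = 0.30    # assumed share of peak actually sustained; not reported

estimates = {}
for active in (33e9, 56e9):   # reported activation range per token
    flops = training_flops(active, TOKENS)
    days = training_days(flops, CHIPS, PEAK, UTILISATION)
    estimates[active] = (flops, days)
    print(f"{active / 1e9:.0f}B active: {flops:.2e} FLOPs, about {days:.0f} days")
```

Under these assumptions the pre-training run lands in the range of a few weeks. The point of the exercise is the shape of the trade-off: because compute scales with active rather than total parameters, a sparse 1.6-trillion-parameter model is within reach of a large cluster of slower chips.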
What This Means for Western Policy
The policy implications are uncomfortable. If frontier-scale AI training is achievable on domestic Chinese hardware — even at a performance discount — then the export-control strategy has not prevented China from developing capable AI. It has imposed costs and delays, but it has also accelerated China's investment in domestic semiconductor capability, creating a long-term strategic liability for the United States.
SiliconAngle's analysis noted that while Nvidia still maintains significant market share in China through grey-market channels and older hardware, the success of models like LongCat-2.0 signals a shrinking window for U.S. hardware dominance. As domestic capacity increases and dependence on Western GPU clusters diminishes, the leverage that export controls provide will erode.
Implications for Developers and Enterprises
For developers outside China, LongCat-2.0 presents a straightforward proposition: a frontier-competitive coding model, available via OpenRouter and Meituan's own API endpoints, at standard prices that undercut the Western frontier by roughly a factor of three, and by more at promotional rates. The MIT licence means it can be used commercially without restriction, and the model's two-month track record as "Owl Alpha" provides a degree of real-world validation that most newly released models lack.
The caveats are real. The model weights are not yet available for self-hosting — Meituan has open-sourced the architecture and released API access, but the full weights are listed as "coming soon." For enterprises with data sovereignty requirements or latency constraints that preclude API-based inference, this limits immediate utility. There are also the standard concerns about supply-chain trust that apply to any model from a Chinese developer operating under Chinese data laws.
For the broader AI industry, the more significant implication is competitive. The Western frontier labs have spent the past two years arguing, implicitly or explicitly, that their models justify premium pricing because they are uniquely capable. LongCat-2.0's SWE-bench Pro score of 59.5 — edging past GPT-5.5's 58.6 — challenges that narrative directly on the benchmark that matters most to the developer community. Whether the gap holds under independent evaluation remains to be seen. But the direction of travel is clear: the frontier is no longer a Western monopoly, and the pricing power that monopoly conferred is under pressure from an unexpected direction.
A food delivery company from Beijing, training on chips that Washington tried to keep out of reach, has just made the most consequential open-source AI release of the year. The gilded cage of government-gated frontier AI that OpenAI and Anthropic are navigating in Washington looks rather different from the vantage point of a 50,000-chip cluster in Shenzhen.
---
*Elena Vance is Neuron's Frontier Correspondent, based in London.*