The Delivery Giant's Gambit: How Meituan's LongCat-2.0 Proved China Can Train Frontier AI Without Nvidia

Meituan, better known for ferrying dumplings across Chinese cities, has open-sourced LongCat-2.0, a 1.6-trillion-parameter coding model trained entirely on domestic Chinese chips. Under a pseudonym, the model quietly topped OpenRouter's agent leaderboards for two months. The release is the most concrete evidence yet that U.S. export controls have not foreclosed China's path to frontier-scale AI.

Elena Vance 🇬🇧 · Frontier Correspondent · Jul 4, 2026 · 11m read

For the better part of two months, a model called "Owl Alpha" quietly climbed the rankings on OpenRouter, the popular model-routing platform used by developers worldwide. It ranked first on the Hermes Agent leaderboard, second on Claude Code, and third in OpenClaw deployments. It processed over 10 trillion tokens monthly, growing at 200% month-on-month. Nobody knew who built it.

The reveal, when it came in late June 2026, was not from OpenAI, Anthropic, or Google DeepMind. It was from Meituan, China's dominant food delivery and local services conglomerate, a company whose primary business involves dispatching couriers on electric scooters to deliver hot pot and bubble tea across Chinese cities. The model was LongCat-2.0, a 1.6-trillion-parameter Mixture-of-Experts system trained entirely on domestic Chinese hardware, now open-sourced under the MIT licence.

The geopolitical implications arrived before the technical ones. Washington has spent three years and considerable diplomatic capital constructing an export-control regime designed to prevent China from acquiring the compute necessary to train frontier-scale AI. The South China Morning Post called LongCat-2.0 "China's biggest AI model trained on local chips." The more precise framing is this: LongCat-2.0 is the most concrete, independently verifiable evidence yet that the export-control strategy has not foreclosed China's path to frontier-scale AI. It has merely redirected it.

The Architecture of Circumvention

The technical story of LongCat-2.0 begins not with the model itself but with the infrastructure that produced it. Meituan trained the model on a cluster of more than 50,000 domestic Chinese AI ASICs, completing the full pre-training run, which consumed over 35 trillion tokens, and serving inference without a single Nvidia GPU. While Meituan has not officially named its chip supplier, industry analysts and reporting from Geopolitechs and SiliconAngle point toward Huawei's Atlas-950 SuperPods and the Huawei Collective Communication Library as the likely substrate.

This is not the first time a Chinese lab has demonstrated frontier-scale training on domestic hardware. Zhipu AI's GLM-5, a 744-billion-parameter MoE model released in February 2026, was trained on a cluster of 100,000 Huawei Ascend 910B chips. But LongCat-2.0 is larger, more capable on the benchmarks that matter to developers, and — crucially — open-sourced. The combination of scale, performance, and permissive licensing makes it a different kind of signal.

How the Architecture Works

LongCat-2.0 is a Mixture-of-Experts model with 1.6 trillion total parameters, but the figure that matters for inference cost is the dynamic activation range: between 33 billion and 56 billion parameters per token, depending on query complexity. The model also incorporates N-gram Embedding, adding 135 billion parameters specifically to capture local token relationships — a technique that improves performance on code and structured text without proportionally increasing inference cost.
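To put the activation range in proportion, the short calculation below divides the quoted active-parameter counts by the quoted total. It uses only the figures in the paragraph above; the function name is illustrative.

```python
# Figures quoted in the article; everything else is plain arithmetic.
TOTAL_PARAMS = 1.6e12   # total parameters
ACTIVE_MIN = 33e9       # active parameters per token, low end
ACTIVE_MAX = 56e9       # active parameters per token, high end


def active_fraction(active: float, total: float = TOTAL_PARAMS) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


low = active_fraction(ACTIVE_MIN)
high = active_fraction(ACTIVE_MAX)
print(f"Active share per token: {low:.2%} to {high:.2%}")
# → Active share per token: 2.06% to 3.50%
```

In other words, each token touches only about 2 to 3.5 per cent of the model, which is why per-token compute tracks the active count rather than the 1.6 trillion headline.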

The context window is a native 1 million tokens, supported by a proprietary mechanism called LongCat Sparse Attention (LSA). LSA achieves linear-complexity attention through three interlocking techniques:

Streaming-aware Indexing, which optimises High Bandwidth Memory utilisation during long-context inference
Cross-Layer Indexing, which reuses saliency patterns across transformer layers to reduce redundant computation
Hierarchical Indexing, a two-stage token selection process that identifies the most relevant tokens before full attention is computed
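Meituan's LSA implementation is not described in enough detail here to reproduce, but the general idea behind a two-stage token selection can be sketched. The toy example below is an assumption-laden illustration, not LSA itself: it scores blocks of keys by their mean vector, keeps the best blocks, and only then scores individual tokens inside them. The function name, the block-mean scoring rule, and all parameters are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def two_stage_select(query: Vector, keys: List[Vector], block_size: int,
                     top_blocks: int, top_tokens: int) -> List[int]:
    """Pick key indices with a coarse block pass followed by a fine token pass.

    Illustrative only: block-mean scoring is an assumption, not LSA's rule.
    """
    # Stage 1: group keys into fixed-size blocks and score each block
    # by the query's dot product with the block's mean key.
    blocks = [list(range(start, min(start + block_size, len(keys))))
              for start in range(0, len(keys), block_size)]

    def block_score(indices: List[int]) -> float:
        mean = [sum(keys[i][d] for i in indices) / len(indices)
                for d in range(len(query))]
        return dot(query, mean)

    kept_blocks = sorted(blocks, key=block_score, reverse=True)[:top_blocks]

    # Stage 2: score individual keys, but only inside the surviving blocks.
    candidates = [i for indices in kept_blocks for i in indices]
    best = sorted(candidates, key=lambda i: dot(query, keys[i]),
                  reverse=True)[:top_tokens]
    return sorted(best)


query = [1.0, 0.0]
keys = [[0.1, 0.0], [0.2, 0.0], [0.9, 0.0], [0.8, 0.0],
        [0.0, 1.0], [0.1, 1.0], [0.7, 0.0], [0.05, 0.0]]
selected = two_stage_select(query, keys, block_size=2, top_blocks=2, top_tokens=3)
print(selected)  # → [2, 3, 6]
```

Full attention would then be computed over the selected indices only. The saving comes from the fact that the fine-grained pass never looks at tokens in blocks the coarse pass discarded.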

The model's training objective was shaped by a strategy Meituan calls Multi-Teacher Optimization via Mixture of Specialized Experts (MOPD), which isolates and optimises three distinct functional domains: Agent Experts (tool invocation and multi-step execution), Reasoning Experts (STEM problem-solving and formal logic), and Interaction Experts (alignment and hallucination suppression). The result is a model that does not merely perform well on aggregate benchmarks but is architecturally oriented toward the specific demands of agentic coding workflows.

The Benchmarks — and Why to Read Them Carefully

According to VentureBeat's reporting and the official LongCat documentation, the model's headline numbers are:

SWE-bench Pro: 59.5 (versus GPT-5.5's 58.6 and Gemini 3.1 Pro's reported score)
Terminal-Bench 2.1: 70.8
FORTE (Multilingual): 77.3

The SWE-bench Pro figure is the one that will attract the most attention, and the most scrutiny. SWE-bench measures a model's ability to resolve real GitHub issues — a task that requires understanding large codebases, identifying the relevant files, writing a patch, and passing the associated test suite. A score of 59.5 is genuinely competitive with the current frontier. But benchmark results published by the model's own developer, without independent replication, warrant the usual caution. The more compelling evidence of LongCat-2.0's real-world utility is the two months it spent as "Owl Alpha" on OpenRouter, where it was selected by developers building production agents without any knowledge of its provenance.

"The stealth deployment was not a marketing stunt — it was an engineering validation," noted one analysis from Geopolitechs. "By the time Meituan revealed the model's identity, it had already proven its stability and utility on domestic infrastructure under real production load."

That validation matters. A model that tops leaderboards in a lab environment but fails under the latency and throughput demands of production agentic workflows is of limited commercial value. LongCat-2.0 demonstrated the opposite: it was already the backbone of agents like Hermes Agent and a top-ranked model on Claude Code before anyone knew Meituan had built it.

The Pricing Signal

Meituan has priced LongCat-2.0 aggressively. Standard API access is $0.75 per million input tokens and $2.95 per million output tokens, with promotional rates as low as $0.30 and $1.20 respectively. The company has also introduced a zero-charge policy for context cache hits — a feature specifically designed to reduce costs for iterative agentic workflows where the same long context is repeatedly queried with small modifications.
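A rough sketch of what free cache hits could mean for an agent loop follows. It uses the standard rates quoted above and makes two simplifying assumptions that are not from Meituan's documentation: the first step pays for the full context and every later step is a cache hit on it, and the context does not grow between steps. The function and its parameters are hypothetical.

```python
INPUT_PRICE = 0.75   # USD per million input tokens (standard rate quoted above)
OUTPUT_PRICE = 2.95  # USD per million output tokens (standard rate quoted above)


def loop_cost(context_tokens: int, new_tokens_per_step: int,
              output_tokens_per_step: int, steps: int,
              cache_hits_free: bool = True) -> float:
    """Estimated USD cost of an agent loop that re-sends one long context.

    Assumption: with free cache hits, the shared context is billed once,
    on the first step, and only the new tokens are billed afterwards.
    """
    total_input = 0
    for step in range(steps):
        pays_for_context = step == 0 or not cache_hits_free
        total_input += new_tokens_per_step
        if pays_for_context:
            total_input += context_tokens
    total_output = output_tokens_per_step * steps
    return (total_input * INPUT_PRICE + total_output * OUTPUT_PRICE) / 1_000_000


with_cache = loop_cost(200_000, 2_000, 1_000, steps=20)
without_cache = loop_cost(200_000, 2_000, 1_000, steps=20, cache_hits_free=False)
print(f"with free cache hits: ${with_cache:.2f}")     # → $0.24
print(f"without caching:      ${without_cache:.2f}")  # → $3.09
```

Under these assumptions, a 20-step loop over a 200,000-token context costs about a thirteenth of the uncached price, which is the kind of workload the policy appears aimed at.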

For comparison, Anthropic's Claude Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens. OpenAI's GPT-5.5 sits at a similar tier. LongCat-2.0's pricing is not merely competitive — it is a deliberate attempt to undercut the Western frontier on cost while matching it on capability. For enterprises running high-volume agentic pipelines, the economics are difficult to ignore.
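The size of the discount can be checked directly from the quoted list prices. The snippet below is simple arithmetic on the figures in the two paragraphs above; the dictionary and function names are illustrative.

```python
# USD per million tokens, as quoted in the article.
LONGCAT_STANDARD = {"input": 0.75, "output": 2.95}
LONGCAT_PROMO = {"input": 0.30, "output": 1.20}
CLAUDE_SONNET_5 = {"input": 2.00, "output": 10.00}


def price_ratio(reference: dict, candidate: dict) -> dict:
    """How many times more expensive the reference is, per token type."""
    return {kind: reference[kind] / candidate[kind] for kind in reference}


standard = price_ratio(CLAUDE_SONNET_5, LONGCAT_STANDARD)
promo = price_ratio(CLAUDE_SONNET_5, LONGCAT_PROMO)
print(f"standard: {standard['input']:.1f}x input, {standard['output']:.1f}x output")
# → standard: 2.7x input, 3.4x output
print(f"promo:    {promo['input']:.1f}x input, {promo['output']:.1f}x output")
# → promo:    6.7x input, 8.3x output
```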

Meituan's Unlikely Trajectory

The company behind LongCat-2.0 is not a frontier AI lab. Meituan is China's dominant local services platform — the company that coordinates millions of delivery orders daily, optimises courier routing in real time, and has built one of the world's most sophisticated logistics operations. Its pivot toward AI is not incidental to that business; it is an extension of it.

The company acquired the AI startup Light Year Beyond for $281 million in 2023, seeding its internal research capability. By 2025, it had committed over 10 billion yuan annually to AI development as part of a broader 21.1 billion yuan R&D budget — a figure that would be remarkable for a dedicated AI lab, let alone a food delivery company. The LongCat model family began with LongCat-Flash in August 2025, a 560-billion-parameter model for general-purpose tasks, followed by LongCat-Flash-Thinking in September 2025, a reasoning-specialised variant. LongCat-2.0 is the third generation, and the first to operate at a scale that places it in direct competition with the Western frontier.

The strategic logic is clear: Meituan's core business depends on AI for logistics optimisation, demand forecasting, and customer service at scale. Building proprietary frontier models is not a vanity project — it is vertical integration of the most critical input to its operations.

The open-source release under the MIT licence is a separate strategic calculation. By releasing the model under permissive terms, Meituan builds developer goodwill, accelerates ecosystem adoption, and positions itself as a credible infrastructure provider in the Chinese AI market, a role that has been largely occupied by Alibaba (via Qwen) and Baidu (via ERNIE). The MIT licence is notably more permissive than the licences attached to many Western open-weight releases, which typically include commercial use restrictions.

The Export Control Question

The release of LongCat-2.0 arrives at a moment of intense debate about the efficacy of U.S. semiconductor export controls. The Biden administration's October 2022 and October 2023 chip rules, extended and tightened by the Trump administration in 2025, were designed to prevent Chinese entities from acquiring the compute necessary to train frontier-scale models. The underlying assumption was that without access to Nvidia's H100 and H200 GPUs — and their successors — Chinese labs would be unable to close the capability gap with their Western counterparts.

LongCat-2.0 does not definitively refute that assumption, but it complicates it significantly. The model was trained on domestic ASICs that, on a per-chip basis, offer substantially lower performance than Nvidia's restricted hardware. The Huawei Ascend 910B, for instance, delivers approximately 320 TFLOPS of FP16 performance — roughly comparable to an Nvidia A100, a chip that predates the most stringent export controls. The gap to Nvidia's current Blackwell generation is substantial.
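A back-of-envelope estimate shows how a large cluster of slower chips can still cover a run of this size. The sketch below rests on several assumptions that are not from Meituan: the common rule of thumb of roughly 6 FLOPs per active parameter per training token, the 910B's quoted 320 TFLOPS as the per-chip peak (the actual chip is unconfirmed), and a hypothetical 30 per cent sustained utilisation. Treat the output as an order-of-magnitude illustration, not a reconstruction of the real training run.

```python
TOKENS = 35e12                   # pre-training tokens quoted in the article
ACTIVE_PARAMS = (33e9, 56e9)     # active-parameter range quoted in the article
CHIPS = 50_000                   # cluster size quoted in the article
PEAK_FLOPS_PER_CHIP = 320e12     # Ascend 910B FP16 figure; actual chip unconfirmed
ASSUMED_UTILISATION = 0.30       # hypothetical sustained fraction of peak
SECONDS_PER_DAY = 86_400


def training_flops(active_params: float, tokens: float) -> float:
    """Rule-of-thumb training cost: about 6 FLOPs per active parameter per token."""
    return 6 * active_params * tokens


def days_to_train(total_flops: float, chips: int = CHIPS,
                  peak: float = PEAK_FLOPS_PER_CHIP,
                  utilisation: float = ASSUMED_UTILISATION) -> float:
    """Wall-clock days at the assumed sustained throughput."""
    sustained_flops_per_second = chips * peak * utilisation
    return total_flops / sustained_flops_per_second / SECONDS_PER_DAY


estimates = [(n, training_flops(n, TOKENS)) for n in ACTIVE_PARAMS]
for active, flops in estimates:
    print(f"{active / 1e9:.0f}B active: {flops:.2e} FLOPs, "
          f"about {days_to_train(flops):.0f} days")
```

With these inputs the estimate lands at roughly 17 to 28 days of ideal compute. Real runs take longer once restarts, evaluation, and communication overhead are included, but the arithmetic illustrates why per-chip performance alone does not settle the question.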

What Meituan and other Chinese labs have demonstrated is that the gap in per-chip performance can be partially compensated by:

Scale: deploying clusters of 50,000 to 100,000 domestic chips rather than the smaller, denser clusters that Nvidia hardware enables
Architectural efficiency: MoE designs that activate only a fraction of total parameters per token, reducing the effective compute requirement per inference
Software optimisation: custom communication libraries and training frameworks (such as Huawei's MindSpore and CANN toolchain) that extract maximum utilisation from available hardware
Data quality: investing in curated, high-quality training corpora rather than relying on raw compute to compensate for data noise

The result is not parity with the Western frontier — GPT-5.6 Sol and Claude Fable 5 remain ahead on the most demanding reasoning benchmarks. But it is close enough to be commercially viable, and the trajectory is improving faster than the export-control regime can adapt.

What This Means for Western Policy

The policy implications are uncomfortable. If frontier-scale AI training is achievable on domestic Chinese hardware — even at a performance discount — then the export-control strategy has not prevented China from developing capable AI. It has imposed costs and delays, but it has also accelerated China's investment in domestic semiconductor capability, creating a long-term strategic liability for the United States.

SiliconAngle's analysis noted that while Nvidia still maintains significant market share in China through grey-market channels and older hardware, the success of models like LongCat-2.0 signals a shrinking window for U.S. hardware dominance. As domestic capacity increases and dependence on Western GPU clusters diminishes, the leverage that export controls provide will erode.

Implications for Developers and Enterprises

For developers outside China, LongCat-2.0 presents a straightforward proposition: a frontier-competitive coding model, available via OpenRouter and Meituan's own API endpoints, at standard prices that undercut Claude Sonnet 5 by a factor of roughly three (about 2.7 times on input and 3.4 times on output), and by more at promotional rates. The MIT licence means it can be used commercially without restriction, and the model's two-month track record as "Owl Alpha" provides a degree of real-world validation that most newly released models lack.

The caveats are real. The model weights are not yet available for self-hosting — Meituan has open-sourced the architecture and released API access, but the full weights are listed as "coming soon." For enterprises with data sovereignty requirements or latency constraints that preclude API-based inference, this limits immediate utility. There are also the standard concerns about supply-chain trust that apply to any model from a Chinese developer operating under Chinese data laws.

For the broader AI industry, the more significant implication is competitive. The Western frontier labs have spent the past two years arguing, implicitly or explicitly, that their models justify premium pricing because they are uniquely capable. LongCat-2.0's SWE-bench Pro score of 59.5 — edging past GPT-5.5's 58.6 — challenges that narrative directly on the benchmark that matters most to the developer community. Whether the gap holds under independent evaluation remains to be seen. But the direction of travel is clear: the frontier is no longer a Western monopoly, and the pricing power that monopoly conferred is under pressure from an unexpected direction.

A food delivery company from Beijing, training on chips that Washington tried to keep out of reach, has just made the most consequential open-source AI release of the year. The gilded cage of government-gated frontier AI that OpenAI and Anthropic are navigating in Washington looks rather different from the vantage point of a 50,000-chip cluster in Shenzhen.

---

*Elena Vance is Neuron's Frontier Correspondent, based in London.*

Links & Resources

Meituan open sources LongCat-2.0, the 1.6T near-frontier agentic coding model trained entirely on Chinese chips. VentureBeat (venturebeat.com)

LongCat-2.0 Official Project Documentation. LongCat AI (longcatai.org)

China Debuts Biggest AI Model Trained on Local Chips, as Meituan Releases LongCat-2.0. South China Morning Post (scmp.com)

LongCat-2.0: China's Most Unexpected Frontier Model. Geopolitechs (geopolitechs.org)

Meituan Open Sources Massive LongCat-2.0 AI Model Trained on Domestic Chips. SiliconAngle (siliconangle.com)

Owl Alpha on OpenRouter Revealed as Meituan's LongCat-2.0 Preview. Decrypt (decrypt.co)

Owl Alpha on OpenRouter Revealed as Meituan's LongCat-2.0. KuCoin News (kucoin.com)

Meituan Trains 1.6T Param AI on China Chips. Nation Press (nationpress.com)

Zero Foreign Chips: China's Food Delivery Giant Meituan Serves a Massive Open-Source AI Model. NDTV Profit (ndtvprofit.com)

Meituan Unveils AI Model LongCat as Local Services Battle Shifts to Technology Front. China Biz Insider (chinabizinsider.com)

China Trained Frontier AI Model GLM-5 Without Nvidia. Let's Data Science (letsdatascience.com)

Grok 4.5 Enters Private Beta at SpaceX and Tesla. TechTimes (techtimes.com)

Anthropic Claude Science AI Workbench. Anthropic (anthropic.com)

LongCat-2.0 Stealth AI. Yahoo Tech (tech.yahoo.com)

OpenRouter, Model Routing Platform (openrouter.ai)
