Xiaomi's MiMo-V2-Flash: 150 Tokens/Sec AI Model Breakthrough
Xiaomi launches the MiMo-V2-Flash AI model: 150 tokens per second at $0.1-$0.3 per million tokens
In the rapidly evolving world of open-source AI, Xiaomi is making bold moves to challenge established players. The Chinese tech giant has quietly developed a new AI model that could shake up the inference speed and cost dynamics of language models.
Enter MiMo-V2-Flash, Xiaomi's latest technological offering that promises to deliver high-performance AI at a fraction of current market prices. Designed to compete with specialized AI models, this new system targets developers and businesses looking for efficient, cost-effective machine learning solutions.
The model's potential lies not just in its technical specifications, but in its strategic positioning. By targeting a sweet spot between performance and affordability, Xiaomi seems to be signaling its serious intent to become a significant player in the global AI landscape.
But how exactly does MiMo-V2-Flash stack up against existing models? The numbers suggest an intriguing story of speed and economic efficiency that could turn heads in the tech community.
According to Xiaomi, the model delivers inference speeds of up to 150 tokens per second and operates at a low cost of $0.1 per million input tokens and $0.3 per million output tokens. On benchmarks, Xiaomi claimed MiMo-V2-Flash achieves performance comparable to Moonshot AI's Kimi K2 Thinking and DeepSeek V3.2 across most reasoning tests, while surpassing Kimi K2 in long-context evaluations. In agentic tasks, the model scored 73.4% on SWE-Bench Verified, outperforming all open-source peers and approaching OpenAI's GPT-5-High.
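At those rates, per-request costs are straightforward to estimate. The sketch below works through the arithmetic for a hypothetical request (the token counts are illustrative, not from Xiaomi):

```python
# Quoted MiMo-V2-Flash rates: $0.1 per million input tokens,
# $0.3 per million output tokens.
INPUT_RATE = 0.1 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.3 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the quoted rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical example: a 2,000-token prompt with a 500-token reply.
cost = request_cost(2_000, 500)
print(f"${cost:.6f}")  # $0.000350
```

Even a million such requests would total around $350, which is the economics Xiaomi appears to be betting on.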
Xiaomi also said it matches Anthropic's Claude 4.5 Sonnet on coding tasks at a fraction of the cost. MiMo-V2-Flash uses a Mixture-of-Experts architecture, which splits a large neural network into specialised sub-networks, and has 309 billion parameters, allowing it to balance performance and efficiency. The design also gave Xiaomi's engineers room for architectural optimisations that significantly reduce the cost of processing long prompts by limiting how much past context the model needs to re-evaluate.
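The core Mixture-of-Experts idea is that a router activates only a few expert sub-networks per token, so most of the model's parameters sit idle on any given step. The sketch below illustrates top-k routing in general terms; the function names and shapes are hypothetical and say nothing about Xiaomi's actual implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route one token vector to its top-k experts and mix their outputs.

    Illustrative MoE routing sketch; not Xiaomi's implementation.
    """
    logits = x @ gate_w                    # one routing score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the chosen experts run; the rest stay idle, which is how a
    # model with hundreds of billions of parameters keeps per-token
    # compute (and therefore cost) low.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a small linear map for demonstration.
experts = [(lambda W: (lambda x: x @ W))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts)
print(y.shape)  # (8,)
```

In a production model the experts are full feed-forward blocks and the routing runs per token per layer, but the cost-saving principle is the same.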
Luo Fuli, a former DeepSeek researcher who recently joined Xiaomi's MiMo team, described the release as "step two on our AGI roadmap" in a post on X, referring to artificial general intelligence.
Xiaomi's latest AI model, MiMo-V2-Flash, signals the company's serious push into generative AI technology. The model's standout features, 150 tokens per second and rock-bottom pricing between $0.1 and $0.3 per million tokens, suggest a strategic approach to making advanced AI more accessible.
Performance benchmarks paint an intriguing picture. Xiaomi claims MiMo-V2-Flash matches top-tier models like Moonshot AI's Kimi K2 and DeepSeek V3.2 across reasoning tests, with a notable edge in long-context evaluations. Its 73.4% score on SWE-Bench Verified is particularly impressive, outperforming current open-source alternatives.
The low-cost, high-speed model hints at Xiaomi's ambition to democratize AI technology. By offering competitive performance at a fraction of existing market rates, the company could potentially disrupt the current AI model landscape. Still, real-world performance will ultimately determine whether MiMo-V2-Flash becomes a meaningful player in the rapidly evolving generative AI space.
Further Reading
- Xiaomi unveils MiMo-V2-Flash, its new open-weight AI model - The Indian Express
- Xiaomi launches open-source model to compete against DeepSeek, Moonshot, OpenAI systems - South China Morning Post
Common Questions Answered
How fast is the inference speed of Xiaomi's MiMo-V2-Flash AI model?
The MiMo-V2-Flash AI model delivers an impressive inference speed of up to 150 tokens per second. This high-performance capability positions the model as a competitive option in the rapidly evolving open-source AI landscape.
What are the pricing details for Xiaomi's MiMo-V2-Flash model?
Xiaomi offers the MiMo-V2-Flash at a remarkably low cost of $0.1 per million input tokens and $0.3 per million output tokens. These competitive pricing rates are designed to make advanced AI technology more accessible to developers and businesses.
How does MiMo-V2-Flash perform on benchmarks compared to other AI models?
According to Xiaomi, the MiMo-V2-Flash achieves performance comparable to Moonshot AI's Kimi K2 Thinking and DeepSeek V3.2 across most reasoning tests. The model particularly excels in long-context evaluations and scored an impressive 73.4% on SWE-Bench Verified in agentic tasks, outperforming other open-source peers.