Qwen Team Open‑Sources Qwen3.6‑35B‑A3B Vision‑Language MoE Model with 3B Params
Qwen’s latest release, Qwen3.6‑35B‑A3B, pushes the envelope for open‑source vision‑language systems. Of its 35 billion total parameters, the model activates only 3 billion at inference, thanks to a mixture‑of‑experts (MoE) design that promises efficiency without a proportional drop in capability. Beyond raw size, the team touts “agentic coding” features, suggesting the model can generate and manipulate code more autonomously than typical LLMs.
Yet the real intrigue lies in how the network is wired beneath the surface. While many large models rely on uniform transformer stacks, this version adopts a staggered block structure that interleaves specialized sublayers. Understanding that hidden layout is key to grasping why the model claims both vision‑language fluency and coding agility.
The following description breaks down the pattern of blocks and the role of its gated components, shedding light on the mechanics that set Qwen3.6‑35B‑A3B apart.
The architecture introduces an unusual hidden layout worth understanding: the model uses a pattern of 10 blocks, each consisting of 3 instances of (Gated DeltaNet → MoE) followed by 1 instance of (Gated Attention → MoE). Across 40 total layers, the Gated DeltaNet sublayers handle linear attention -- a computationally cheaper alternative to standard self-attention -- while the Gated Attention sublayers use Grouped Query Attention (GQA), with 16 attention heads for Q and only 2 for KV, significantly reducing KV-cache memory pressure during inference. The model supports a native context length of 262,144 tokens, extensible up to 1,010,000 tokens using YaRN (Yet another RoPE extensioN) scaling.
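The described schedule is easy to sanity-check numerically. The sketch below expands the per-block pattern and verifies the layer totals, along with the context-scaling factor implied by the YaRN figures; the sublayer names are illustrative only, not the model's real module names.

```python
# Sketch of the stated sublayer schedule: 10 blocks, each with
# 3x (Gated DeltaNet -> MoE) followed by 1x (Gated Attention -> MoE).
# Sublayer names here are illustrative, not the real module names.
NUM_BLOCKS = 10
BLOCK_PATTERN = ["gated_deltanet"] * 3 + ["gated_attention"]

schedule = BLOCK_PATTERN * NUM_BLOCKS

print(len(schedule))                     # 40 layers in total
print(schedule.count("gated_deltanet"))  # 30 linear-attention sublayers
print(schedule.count("gated_attention")) # 10 GQA sublayers

# Context-scaling factor implied by the stated YaRN figures:
print(round(1_010_000 / 262_144, 2))     # ~3.85x over native context
```

Note that only 10 of the 40 sublayers use full (grouped-query) attention, which is where the bulk of the KV-cache cost would otherwise accumulate.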
Agentic Coding Is Where This Model Gets Serious

On SWE-bench Verified -- the canonical benchmark for real-world GitHub issue resolution -- Qwen3.6-35B-A3B scores 73.4, compared to 70.0 for Qwen3.5-35B-A3B and 52.0 for Gemma4-31B. On Terminal-Bench 2.0, which evaluates an agent completing tasks inside a real terminal environment with a three-hour timeout, Qwen3.6-35B-A3B scores 51.5 -- the highest among all compared models, including Qwen3.5-27B (41.6), Gemma4-31B (42.9), and Qwen3.5-35B-A3B (40.5). On QwenWebBench, an internal bilingual front-end code generation benchmark covering seven categories -- Web Design, Web Apps, Games, SVG, Data Visualization, Animation, and 3D -- Qwen3.6-35B-A3B scores 1397, well ahead of Qwen3.5-27B (1068) and Qwen3.5-35B-A3B (978).
On STEM and reasoning benchmarks, the numbers are equally striking. Qwen3.6-35B-A3B scores 92.7 on AIME 2026 (the full AIME I & II) and 86.0 on GPQA Diamond -- a graduate-level scientific reasoning benchmark -- both competitive with much larger models.

Multimodal Vision Performance

Qwen3.6-35B-A3B is not a text-only model.
Does the new model truly shift the focus toward efficiency? Qwen3.6‑35B‑A3B is the first open‑weight release from Alibaba’s Qwen3.6 line, and it makes a clear claim: 35 billion total parameters can be trimmed to just 3 billion active ones without sacrificing agentic coding performance that rivals dense models ten times larger.
The design is unusual, and the sparse‑MoE approach appears to deliver the promised parameter efficiency. Yet the strongest evidence so far comes from coding and reasoning benchmarks, and it remains unclear whether comparable gains will carry over to broader vision‑language applications or different inference conditions. Moreover, the practical impact of the active‑parameter budget on latency and hardware utilization has not been quantified.
In short, the release offers a noteworthy data point for efficiency‑focused research, but its broader significance is still uncertain.
Common Questions Answered
How does the Qwen3.6-35B-A3B model achieve computational efficiency?
The model uses a Mixture-of-Experts (MoE) design that activates only 3 billion parameters during inference, despite having a total of 35 billion parameters. This approach allows the model to maintain high performance while significantly reducing computational requirements, making it more efficient than traditional dense models.
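As a rough illustration of how sparse activation works in general -- not Qwen's actual router, whose details are not published here -- a top-k MoE gate scores all experts for each token but runs only the few with the highest router logits:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_topk(logits, k=2):
    """Keep the k experts with the highest router scores and
    renormalize their gate weights; all other experts stay idle."""
    probs = softmax(logits)
    top = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)[:k]
    z = sum(probs[i] for i in top)
    return {i: probs[i] / z for i in top}

# 8 hypothetical experts, 2 active per token: most weights never execute,
# which is how total and active parameter counts can differ so sharply.
gates = route_topk([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.2], k=2)
print(sorted(gates))                   # [1, 4] -- the two winning experts
print(round(sum(gates.values()), 6))   # 1.0 -- renormalized gate weights
```

Because only the selected experts run their feed-forward pass, compute per token scales with the active parameter count rather than the total.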
What unique architectural features distinguish the Qwen3.6-35B-A3B model?
The model features a distinctive architecture with 10 blocks, each containing 3 instances of (Gated DeltaNet → MoE) followed by 1 instance of (Gated Attention → MoE). The Gated DeltaNet sublayers use linear attention, while the Gated Attention sublayers use Grouped Query Attention (GQA) with 16 attention heads for queries and only 2 for keys/values, enabling more efficient processing.
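The KV-cache saving implied by those head counts can be checked with back-of-the-envelope arithmetic; the head dimension and fp16 storage below are assumptions for illustration, not published figures:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Per-sequence KV-cache size: one K and one V tensor per layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Assumptions: only the 10 Gated Attention sublayers keep a KV cache,
# head_dim = 128, fp16 storage (2 bytes). None of these are published.
SEQ = 262_144  # native context length
mha_style = kv_cache_bytes(layers=10, kv_heads=16, head_dim=128, seq_len=SEQ)
gqa_style = kv_cache_bytes(layers=10, kv_heads=2, head_dim=128, seq_len=SEQ)

print(mha_style // gqa_style)   # 8x smaller cache with 2 KV heads vs 16
print(gqa_style / 2**30)        # 2.5 GiB at full native context
```

Under these illustrative assumptions, dropping from 16 to 2 KV heads shrinks the cache eightfold, which is what makes very long contexts practical at inference time.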
What makes the Qwen3.6-35B-A3B model notable in open-source vision-language systems?
The model introduces advanced 'agentic coding' capabilities, suggesting it can generate and manipulate code more autonomously than typical large language models. As the first open-weight release from Alibaba's Qwen3.6 line, it demonstrates the potential to rival dense models ten times its size in performance.