Qwen 3.5: Tiny AI Models Beat Massive Predecessors
Alibaba open-sources Qwen3.5-Medium models with Sonnet 4.5 performance locally
Why does this matter now? While most large‑language models still demand cloud‑grade hardware, Alibaba’s latest Qwen 3.5‑Medium series claims to deliver Sonnet 4.5‑level results on a typical workstation. The company says the new line runs “Thinking Mode” by default, pausing before it spits out a final answer to mimic a more deliberate reasoning process.
But the headline isn’t just about speed; it’s about access. By releasing a 35‑billion‑parameter base model and its instruction‑tuned siblings under an open license, Alibaba is betting that researchers and developers will tinker locally rather than rely on proprietary APIs. Here’s the thing: open‑sourcing such a sizable model could lower the barrier for experiments that previously required expensive cloud credits.
The move also hints at a shift toward models that prioritize internal deliberation over raw output. All of that sets the stage for the official statement below.
Base Model Release: In a move to support the research community, Alibaba has open‑sourced the Qwen3.5-35B-A3B-Base model alongside the instruct‑tuned versions.
Product: Intelligence that 'thinks' first. Qwen 3.5 introduces a native "Thinking Mode" as its default state. Before providing a final answer, the model generates an internal reasoning chain, delimited by dedicated tags, to work through complex logic.
The product lineup is tailored for varying hardware environments:
- Qwen3.5-27B: Optimized for high efficiency, supporting a context length of over 800K tokens.
- Qwen3.5-Flash: The production-grade hosted version, featuring a default 1 million token context length and built-in official tools.
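Because the reasoning chain is emitted inline with the answer, downstream code typically needs to separate the two. Below is a minimal sketch of that step, assuming the delimiters follow the `<think>...</think>` convention used by earlier Qwen reasoning releases; the actual tag name for Qwen 3.5 is not specified in the announcement, so `tag` is a parameter.

```python
import re

def split_thinking(output: str, tag: str = "think") -> tuple[str, str]:
    """Separate the internal reasoning chain from the final answer.

    Assumes the model wraps its deliberation in <think>...</think> tags
    (the convention of earlier Qwen reasoning models); pass a different
    `tag` if Qwen 3.5 uses another delimiter.
    """
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
    match = pattern.search(output)
    if not match:
        # No reasoning block found: treat the whole output as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = pattern.sub("", output, count=1).strip()
    return reasoning, answer

raw = "<think>4 doubled is 8.</think>The answer is 8."
reasoning, answer = split_thinking(raw)
```

Stripping the reasoning before display or logging also avoids leaking the (often verbose) deliberation into user-facing responses.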
Alibaba’s Qwen 3.5‑Medium series arrives as a quartet of open‑source LLMs, three of which are cleared for commercial use under Apache 2.0. The models—Qwen3.5‑35B‑A3B, Qwen3.5‑122B‑A10B and Qwen3.5‑27B—are already hosted on Hugging Face and ModelScope, and a base‑model release accompanies the instruct‑tuned variants. They tout “Thinking Mode” as the default, meaning the system pauses to reason before delivering a final answer.
The claim of Sonnet 4.5‑level performance on local hardware is bold, yet the announcement includes no benchmark details. Will the promised reasoning step translate into measurable gains for everyday developers? Without third‑party validation, the practical impact remains unclear.
Alibaba’s move certainly broadens the pool of freely available large models, but adoption will depend on how well the “Thinking Mode” integrates with existing pipelines and whether the performance holds up across diverse workloads. The community now has the code; time will reveal how useful it proves in real‑world settings.
Further Reading
- Qwen 3.5 - Advanced AI Model Created by Alibaba Cloud - Overchat.ai
- Qwen3-Coder-Next: The Complete 2026 Guide to Running Powerful AI Coding Agents Locally - Dev.to
- Qwen3.5 397B A17B (Reasoning) vs Claude 4.5 Sonnet (Reasoning) - Artificial Analysis
- Qwen3.5 Reviews in 2026 - SourceForge
Common Questions Answered
What makes the Qwen3.5-35B-A3B model unique in Alibaba's new lineup?
The Qwen3.5-35B-A3B is a sparse Mixture-of-Experts (MoE) model with only 3 billion active parameters out of 35 billion total. Remarkably, it is reported to outperform the previous flagship Qwen3-235B-A22B model while being roughly seven times smaller, demonstrating significant efficiency and performance improvements.
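The efficiency claim comes down to simple arithmetic: a sparse MoE model routes each token through only a fraction of its weights. A quick sketch using the figures encoded in the model names (the "A3B"/"A22B" suffixes denote active parameters) illustrates the scale difference:

```python
def active_ratio(total_b: float, active_b: float) -> float:
    """Fraction of parameters used per forward pass in a sparse MoE model."""
    return active_b / total_b

# Figures taken from the model names:
# Qwen3.5-35B-A3B  -> 35B total,  3B active
# Qwen3-235B-A22B  -> 235B total, 22B active (previous flagship)
new_ratio = active_ratio(35, 3)     # under 9% of weights active per token
old_ratio = active_ratio(235, 22)   # about 9.4% active
size_factor = 235 / 35              # total-parameter shrink, roughly 6.7x
```

Both models activate a similar fraction of their weights per token; the headline gain is the roughly 6.7x reduction in total parameters, which directly shrinks the memory footprint needed to run the model locally.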
How does the new Qwen 3.5 'Thinking Mode' work?
Thinking Mode is the default reasoning approach where the model pauses to generate an internal reasoning chain before providing a final answer. This approach mimics a more deliberate thought process, allowing the model to work through complex logic before delivering its response.
What licensing options are available for the Qwen 3.5 Medium series models?
The Qwen 3.5 Medium series models are released under the Apache 2.0 license, which allows for both personal and commercial use. The models are currently hosted on platforms like Hugging Face and ModelScope, making them easily accessible to researchers and developers.