
Gemma 4: Google's Lean Open-Source AI Model Unleashed

Google releases Gemma 4 under Apache 2.0, promising lower memory use and near-zero latency


Google has put its latest Gemma 4 models into the open‑source arena, moving them to an Apache 2.0 licence and promising a tighter fit for everyday machines. The shift follows the earlier Gemma 3 release, which already let developers run large language models on laptops and phones. This time, Google says the new series trims the resource demands even further while pushing inference speed to a point where delays are barely perceptible.

The company frames the upgrade as the most capable option anyone can host locally, positioning it as a practical alternative to cloud‑only offerings. For developers weighing memory footprints, battery drain and responsiveness, those claims matter. Here's how the claims are being framed:

Not only do they use less memory and battery than Gemma 3, but Google also touts "near-zero latency" this time around. All the new Gemma 4 models will reportedly leave Gemma 3 in the dust; Google claims these are the most capable models you can run on your local hardware. Google says Gemma 31B will debut at number three on the Arena list of top open AI models, behind GLM-5 and Kimi 2.5. However, even the biggest Gemma 4 variant is a fraction of the size of those models, making it theoretically much cheaper to run.

Google’s Gemma 4 arrives under an Apache 2.0 licence, a clear shift from the proprietary terms that have governed its Gemini siblings. Four model sizes are now advertised for local deployment, each tuned to consume less memory and battery than the year‑old Gemma 3. Google also touts “near‑zero latency,” positioning the suite as the most capable open‑weight option for on‑device inference.

The licensing change directly addresses developer complaints about restrictive AI terms, offering broader freedom to experiment and integrate. Yet the claim that these models will “leave Gemma 3 in the dust” rests on internal benchmarks; independent verification is still pending. Likewise, the promise of “most capable” performance on local hardware is compelling, but whether the improvements translate across diverse workloads remains unclear.

In practice, developers now have a set of lighter, faster models they can run without cloud dependencies, and the open licence removes a barrier that has frustrated many. Whether the combination of lower resource demands and the new licence will drive wider adoption remains to be seen.

Common Questions Answered

How does Gemma 4 improve upon the previous Gemma 3 models in terms of performance?

Gemma 4 offers significant improvements by reducing memory and battery consumption compared to Gemma 3. Google claims these new models provide near-zero latency and are more powerful, with the Gemma 31B model expected to rank third on the Arena list of top open AI models.

What licensing approach is Google using for the Gemma 4 models?

Google has released Gemma 4 under the Apache 2.0 license, which is a significant departure from the proprietary terms used for its Gemini models. This open licensing approach addresses developer concerns and provides more flexibility for use and deployment of the AI models.

What are the deployment capabilities of the Gemma 4 models?

Google has designed Gemma 4 to be highly deployable on local hardware, with four different model sizes available for on-device inference. The models are specifically optimized to run efficiently on laptops, phones, and other everyday machines with minimal resource requirements.