ScrollWorthy
Google Gemma 4: Open-Source AI Models Under Apache 2.0

On April 2, 2026, Google dropped a major announcement for the open-source AI community: Gemma 4, a new family of four open-weight models built directly on the same research and technology powering its proprietary Gemini 3 models. But what's generating the most buzz isn't just the performance numbers — it's the license change. Gemma 4 ships under the Apache 2.0 license, a significant shift away from Google's previous restrictive custom Gemma Terms of Use, giving developers, researchers, and businesses genuine freedom to build, modify, and deploy these models however they see fit. As Ars Technica reports, this move marks a pivotal moment in Google's open AI strategy.

What Is Gemma 4? The Four Models Explained

Gemma 4 is not a single model — it's a family of four distinct models spanning a wide range of hardware targets, from smartphones to professional GPU workstations. Here's a breakdown of what Google has released:

  • Effective 2B (E2B): The smallest model in the family, designed for mobile and edge devices. Despite its compact size, it supports multimodal inputs including images, video, audio, and speech.
  • Effective 4B (E4B): A slightly larger mobile-first model, also supporting audio inputs alongside images and video. Both the E2B and E4B were developed with input from Google's Pixel hardware team and optimized in partnership with Qualcomm and MediaTek for on-device inference.
  • 26B Mixture of Experts (MoE): A highly efficient large model that activates only 3.8 billion of its 26 billion parameters during inference, enabling faster token generation without sacrificing capability. This model ranked sixth on Arena AI's text leaderboard at launch.
  • 31B Dense: The flagship model of the family, debuting at number three on Arena AI's text leaderboard. It's designed to run unquantized on a single Nvidia H100 80GB GPU and can be quantized to run on consumer-grade hardware.
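As a quick reference, the lineup above can be collapsed into a small lookup table. This is just an illustrative sketch: the parameter counts and modality support come from the article, but the structure and field names are our own.

```python
# The four-model Gemma 4 lineup from Google's announcement, as a
# quick lookup table. All four accept image and video input; only
# the two mobile-first models add audio/speech.
GEMMA4_FAMILY = {
    "E2B":       {"params_b": 2,  "target": "mobile/edge",      "audio": True},
    "E4B":       {"params_b": 4,  "target": "mobile/edge",      "audio": True},
    "26B-MoE":   {"params_b": 26, "target": "single GPU",       "audio": False},
    "31B-Dense": {"params_b": 31, "target": "H100-class GPU",   "audio": False},
}

def audio_capable(family: dict) -> list:
    """Names of the models that accept audio/speech input."""
    return [name for name, spec in family.items() if spec["audio"]]

print(audio_capable(GEMMA4_FAMILY))  # → ['E2B', 'E4B']
```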

As Engadget notes, all four models share a common capability: they can process both video and images, making Gemma 4 a fully multimodal open-weight family from top to bottom.

The Apache 2.0 License: Why It Matters for Developers

The technical specs are impressive, but the licensing change is what's turning heads in developer communities. Previous Gemma models shipped under a custom Google license that imposed restrictions on commercial use, redistribution, and fine-tuning. The switch to Apache 2.0 removes nearly all of those barriers.

Under Apache 2.0, developers can:

  • Use the models commercially without royalty payments
  • Modify, fine-tune, and redistribute the models freely
  • Integrate Gemma 4 into proprietary products without open-sourcing their own code
  • Deploy in enterprise environments without legal ambiguity

This is a direct response to the growing momentum behind truly open models like Meta's LLaMA series, which have captured developer mindshare precisely because of their permissive licensing. Google appears to be signaling that it wants Gemma 4 to be the open foundation model of choice for the next generation of AI applications.

Running Locally: From Smartphones to Workstations

One of Gemma 4's defining design goals is local deployment across the full hardware spectrum. As ZDNet reports, the models are engineered to bring powerful AI to devices without requiring a cloud connection.

For mobile developers, the E2B and E4B models are particularly exciting. Optimized for chips from Qualcomm and MediaTek — the silicon found in most Android devices — these models can enable real-time on-device AI assistants, translation across more than 140 supported languages, and sophisticated audio understanding without any data leaving the device.

For workstation users and hobbyist AI builders, the larger models are accessible too. The 31B Dense model is designed to run unquantized on a single Nvidia H100 80GB GPU, while quantized versions can run on more affordable consumer cards. PCWorld highlights that modern PCs equipped with Nvidia RTX GPUs are well-positioned to take advantage, with quantized Gemma 4 variants runnable on mid-range gaming hardware.
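A back-of-envelope memory estimate shows why these hardware targets are plausible. This is a rough sketch: the 20% overhead factor for activations and KV cache is our assumption, not an official Google figure.

```python
# Rough VRAM estimate for hosting a model at a given precision.
# Weights take (params * bits / 8) bytes; the 1.2x multiplier is an
# assumed allowance for activations and KV cache, not an official number.
def vram_gb(params_billion: float, bits_per_param: int, overhead: float = 1.2) -> float:
    """Approximate GPU memory needed, in GB."""
    weight_gb = params_billion * bits_per_param / 8  # 1e9 params ≈ 1 G-units
    return round(weight_gb * overhead, 1)

# 31B Dense at 16-bit (bf16): ~74 GB, consistent with the claim that
# it fits unquantized on a single 80 GB H100.
print(vram_gb(31, 16))  # → 74.4

# The same model quantized to 4-bit: ~19 GB, within reach of a 24 GB
# consumer RTX card.
print(vram_gb(31, 4))   # → 18.6
```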

Google claims Gemma 4 delivers "an unprecedented level of intelligence-per-parameter," and the benchmark results for the 31B Dense model suggest that claim holds up — at least for a model in its weight class.

Multimodal Capabilities and Offline Code Generation

Every model in the Gemma 4 family supports image and video inputs, making multimodal reasoning a baseline feature rather than a premium add-on. The two smaller models — E2B and E4B — go further by adding audio input and speech understanding, enabling voice-driven applications that run entirely on-device.

As SiliconAngle reports, Google emphasizes that Gemma 4 brings complex reasoning skills to low-power devices — a capability previously reserved for much larger cloud-hosted models. The improvements in math, reasoning, and instruction-following stem directly from the Gemini 3 research that underpins the entire family.

Another capability generating significant interest is offline code generation. Gemma 4 supports local "vibe coding" — the practice of using natural language prompts to generate and iterate on code — without any internet connection required. For developers working in air-gapped environments, on planes, or simply concerned about sending proprietary code to external APIs, this is a meaningful advantage.
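As a sketch of what fully offline code generation can look like, the snippet below builds a request payload for a local Ollama server's /api/generate endpoint. The model tag "gemma4:31b" is an assumption for illustration; check `ollama list` for the actual tag once the model is pulled.

```python
import json

# Offline "vibe coding" against a local Ollama server. Nothing leaves
# the machine: Ollama serves the model at localhost:11434.
# NOTE: the model tag "gemma4:31b" is assumed, not confirmed.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_codegen_request(prompt: str, model: str = "gemma4:31b") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": f"Write only code, no explanation.\n\n{prompt}",
        "stream": False,  # one complete response instead of token chunks
    }

payload = build_codegen_request("A Python function that reverses a linked list")
print(json.dumps(payload, indent=2))

# To actually run it (requires the Ollama server and a pulled model):
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, json.dumps(payload).encode(),
#                                {"Content-Type": "application/json"})
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```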

Where to Get Gemma 4: Download and Deployment Options

Google has made Gemma 4 available across the most popular model distribution platforms:

  • Hugging Face: All four models are available for direct download, with support for standard transformer inference pipelines.
  • Kaggle: Available through Google's data science platform, with integrated notebook environments for quick experimentation.
  • Ollama: For developers who prefer a simple local deployment experience, Gemma 4 models are available via Ollama's one-command setup, which makes them easy to run on a laptop or local workstation.

The Mixture of Experts architecture of the 26B model is worth understanding for those planning deployments. By activating only 3.8B parameters per forward pass, it achieves inference speeds closer to a 4B model while retaining the knowledge capacity of a much larger one — a compelling option for teams that need a balance between speed and capability on a single Nvidia H100 80GB GPU.
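The trade-off described above is easy to quantify. The following is an idealized estimate that assumes per-token compute scales linearly with active parameters, ignoring routing overhead and memory bandwidth.

```python
# Back-of-envelope on the 26B MoE's sparse activation, using the
# numbers from the announcement: 26B total parameters, 3.8B active
# per forward pass. Per-token decode cost scales roughly with
# *active* parameters (a simplification that ignores routing
# overhead and memory bandwidth).
TOTAL_PARAMS_B = 26.0
ACTIVE_PARAMS_B = 3.8

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active per token: {active_fraction:.1%}")            # → 14.6%

# Idealized per-token speedup versus running all 26B densely:
speedup = TOTAL_PARAMS_B / ACTIVE_PARAMS_B
print(f"ideal decode speedup vs dense 26B: {speedup:.1f}x")  # → 6.8x
```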

Gemma 4 vs. the Competition: Where Does It Stand?

The open-weight model space is intensely competitive in 2026. Meta's LLaMA series, Mistral's offerings, and various fine-tuned derivatives all compete for developer attention. Gemma 4's debut at number three on Arena AI's text leaderboard for the 31B Dense model is a strong opening statement — particularly for a model that can run on a single GPU.

The key differentiators for Gemma 4 are:

  • Gemini 3 lineage: Direct knowledge transfer from Google's most capable proprietary research
  • True Apache 2.0 licensing: No custom terms, no ambiguity
  • Mobile-first small models: Hardware-optimized for real-world on-device deployment
  • Full multimodal support across all sizes: Video, image, and (for smaller models) audio
  • 140+ language support: Strong multilingual capability at every parameter scale

Frequently Asked Questions About Gemma 4

Can Gemma 4 run on a consumer GPU?

Yes. While the 31B Dense model is designed to run unquantized on an Nvidia H100 80GB GPU, all models can be quantized to run on consumer Nvidia RTX GPU hardware. The E2B and E4B models are specifically optimized for mobile-class chips from Qualcomm and MediaTek.

Is Gemma 4 truly free to use commercially?

Yes. The Apache 2.0 license permits commercial use, modification, and redistribution with minimal restrictions. This is a major upgrade from the previous Gemma custom license, which imposed significant limitations on commercial applications.

What languages does Gemma 4 support?

Gemma 4 supports more than 140 languages, making it one of the most multilingual open-weight model families available. This extends across all four model sizes.

How does the Mixture of Experts model work?

The 26B MoE model has 26 billion total parameters but only activates 3.8 billion of them for any given inference pass. This sparse activation pattern means it generates tokens at speeds closer to a 4B model while retaining the broader knowledge of a much larger architecture.

Where can I download Gemma 4?

Gemma 4 models are available on Hugging Face, Kaggle, and Ollama. For local deployment on a laptop or workstation, Ollama offers the simplest setup experience.

Conclusion: A Turning Point for Open-Source AI

Gemma 4 represents Google's most serious commitment yet to the open-source AI ecosystem. By combining Gemini 3-level technology with a genuine Apache 2.0 license, a full multimodal feature set, and a hardware range spanning smartphones to professional GPUs like the Nvidia H100 80GB, Google has released a model family that developers can realistically build on without legal or technical friction.

The benchmark results — third and sixth on Arena AI's text leaderboard for the two large models — suggest the performance is there. The licensing change ensures the freedom is there too. Whether you're building a multilingual mobile app, running offline code generation on a local workstation, or fine-tuning for a specialized enterprise use case, Gemma 4 deserves serious consideration as a foundation model for 2026 and beyond.
