Overview
Gemma 3 is Google's family of open-weight models, released March 12, 2025 and built from the same research and technology behind Gemini 2.0. It ships in four sizes — 1B, 4B, 12B and 27B parameters — so you can pick the one that fits your hardware, from a phone-scale 1B up to a 27B model that runs on a single NVIDIA H100. A compact 270M size was added in August 2025.
Unlike Gemma 2, Gemma 3 is multimodal: the 4B, 12B and 27B variants accept both image and text input (via a 400M-parameter SigLIP vision encoder) and generate text, while the 1B variant is text-only. The context window grew to 128K tokens for the 4B and larger sizes (32K for 1B), and the models support over 140 languages. Architecturally, Gemma 3 interleaves local sliding-window attention with global attention at a 5:1 ratio to keep long-context memory use manageable.
Despite its size, the instruction-tuned 27B model is competitive with much larger systems: it scored 67.5% on MMLU-Pro and 89.0% on MATH, and reached an LMArena Elo around 1338 — ahead of several open models many times its parameter count. Weights are downloadable from Hugging Face, Kaggle and Ollama under Google's custom Gemma Terms of Use (which permits commercial use with attribution), and the models are also served by API providers such as OpenRouter and Google AI Studio.
| Released | 2025-03-12 |
|---|---|
| License | Gemma Terms of Use |
| Weights | Open weights |
| Parameters | 1B · 4B · 12B · 27B (270M added Aug 2025) |
| Context | 128K |
| Max output | 8K |
| Architecture | Dense transformer with 5:1 interleaved local/global attention (1024-token sliding-window local layers); 4B/12B/27B add a 400M SigLIP vision encoder. |
| Knowledge cutoff | August 2024 |
| Modalities | Text, Vision |
| Status | Generally available |
Benchmarks
- MMLU-Pro (27B IT)67.5%
- GSM8K (27B IT)95.9%
- MATH (27B IT)89%
- HumanEval (27B IT)87.8%
- GPQA Diamond (27B IT)42.4%
- LiveCodeBench (27B IT)29.7%
- MMMU val (27B IT, vision)64.9%
- Global-MMLU-Lite (27B IT)75.1%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.08 per million tokens |
|---|---|
| Output | $0.16 per million tokens |
Weights are free to download; this is a representative served price for the 27B (OpenRouter). Self-hosting cost depends on your own hardware.
Strengths
- Strong capability-per-parameter — the 27B runs on a single H100 yet rivals far larger models
- Multimodal: 4B/12B/27B accept image + text input via a SigLIP vision encoder
- Large 128K-token context (32K for the 1B size) for long documents
- Broad multilingual coverage — over 140 languages
- Open weights downloadable from Hugging Face, Kaggle and Ollama for local and commercial use
- Four sizes (1B–27B) spanning on-device to single-GPU deployment, plus a 270M size added Aug 2025
Best for
- Self-hosted chat and assistants where data must stay local
- On-device and edge inference with the 1B/4B sizes
- Multimodal tasks: image understanding, document/chart Q&A (4B+)
- Multilingual applications across 140+ languages
- Long-context summarization and retrieval over large inputs
- Fine-tuning a small open base for a domain-specific task
How to access
| Provider | Model ID |
|---|---|
| Google AI Studio / Gemini API ↗ | gemma-3-27b-it |
| OpenRouter ↗ | google/gemma-3-27b-it |
| Hugging Face ↗ | google/gemma-3-27b-it |
| Ollama ↗ | gemma3:27b |
Gemma (open weights) — every version
The full lineage of the Gemma (open weights) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
FAQ
What sizes does Gemma 3 come in?
Gemma 3 launched in 1B, 4B, 12B and 27B parameter sizes; a compact 270M size was added in August 2025. The 27B is the flagship and runs on a single NVIDIA H100 GPU.
Is Gemma 3 multimodal?
Yes for the 4B, 12B and 27B sizes, which accept both image and text input via a 400M SigLIP vision encoder and output text. The 1B size is text-only.
What is Gemma 3's context window and knowledge cutoff?
The 4B, 12B and 27B models have a 128K-token context window (the 1B is 32K). Training data has a knowledge cutoff of August 2024.
Is Gemma 3 free and open source?
The weights are free to download from Hugging Face, Kaggle and Ollama, but the license is Google's custom Gemma Terms of Use rather than a standard open-source license. It permits commercial use with attribution and some restrictions, so it is best described as open-weight.