Gemma 3

Name: Gemma 3
Author: Google

Google's open multimodal model — 1B/4B/12B/27B sizes, 128K context, 140+ languages, runs on a single GPU.

Overview

Gemma 3 is Google's family of open-weight models, released March 12, 2025 and built from the same research and technology behind Gemini 2.0. It ships in four sizes — 1B, 4B, 12B and 27B parameters — so you can pick the one that fits your hardware, from a phone-scale 1B up to a 27B model that runs on a single NVIDIA H100. A compact 270M size was added in August 2025.

Unlike Gemma 2, Gemma 3 is multimodal: the 4B, 12B and 27B variants accept both image and text input (via a 400M-parameter SigLIP vision encoder) and generate text, while the 1B variant is text-only. The context window grew to 128K tokens for the 4B and larger sizes (32K for 1B), and the models support over 140 languages. Architecturally, Gemma 3 interleaves local sliding-window attention with global attention at a 5:1 ratio to keep long-context memory use manageable.

Despite its size, the instruction-tuned 27B model is competitive with much larger systems: it scored 67.5% on MMLU-Pro and 89.0% on MATH, and reached an LMArena Elo around 1338 — ahead of several open models many times its parameter count. Weights are downloadable from Hugging Face, Kaggle and Ollama under Google's custom Gemma Terms of Use (which permits commercial use with attribution), and the models are also served by API providers such as OpenRouter and Google AI Studio.

Released	2025-03-12
License	Gemma Terms of Use
Weights	Open weights
Parameters	1B · 4B · 12B · 27B (270M added Aug 2025)
Context	128K
Max output	8K
Architecture	Dense transformer with 5:1 interleaved local/global attention (1024-token sliding-window local layers); 4B/12B/27B add a 400M SigLIP vision encoder.
Knowledge cutoff	August 2024
Modalities	Text, Vision
Status	Generally available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.08 per million tokens
Output	$0.16 per million tokens

Weights are free to download; this is a representative served price for the 27B (OpenRouter). Self-hosting cost depends on your own hardware.

Pricing source ↗

Strengths

Strong capability-per-parameter — the 27B runs on a single H100 yet rivals far larger models
Multimodal: 4B/12B/27B accept image + text input via a SigLIP vision encoder
Large 128K-token context (32K for the 1B size) for long documents
Broad multilingual coverage — over 140 languages
Open weights downloadable from Hugging Face, Kaggle and Ollama for local and commercial use
Four sizes (1B–27B) spanning on-device to single-GPU deployment, plus a 270M size added Aug 2025

Best for

Self-hosted chat and assistants where data must stay local
On-device and edge inference with the 1B/4B sizes
Multimodal tasks: image understanding, document/chart Q&A (4B+)
Multilingual applications across 140+ languages
Long-context summarization and retrieval over large inputs
Fine-tuning a small open base for a domain-specific task

How to access

Provider	Model ID
Google AI Studio / Gemini API ↗	`gemma-3-27b-it`
OpenRouter ↗	`google/gemma-3-27b-it`
Hugging Face ↗	`google/gemma-3-27b-it`
Ollama ↗	`gemma3:27b`

Gemma (open weights) — every version

The full lineage of the Gemma (open weights) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Gemma 4current	2026-04-02	—	Apache-2.0
Gemma 3n	2025-06-26	—	Open weights
Gemma 3	2025-03-12	—	Open weights
Gemma 2	2024-06-27	—	Open weights
Gemma 1	2024-02-21	—	Open weights

FAQ

What sizes does Gemma 3 come in?

Gemma 3 launched in 1B, 4B, 12B and 27B parameter sizes; a compact 270M size was added in August 2025. The 27B is the flagship and runs on a single NVIDIA H100 GPU.

Is Gemma 3 multimodal?

Yes for the 4B, 12B and 27B sizes, which accept both image and text input via a 400M SigLIP vision encoder and output text. The 1B size is text-only.

What is Gemma 3's context window and knowledge cutoff?

The 4B, 12B and 27B models have a 128K-token context window (the 1B is 32K). Training data has a knowledge cutoff of August 2024.

Is Gemma 3 free and open source?

The weights are free to download from Hugging Face, Kaggle and Ollama, but the license is Google's custom Gemma Terms of Use rather than a standard open-source license. It permits commercial use with attribution and some restrictions, so it is best described as open-weight.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Gemma (open weights) — every version

// FAQ