AI/TLDR

Gemini 1.0 Nano

Google's first on-device Gemini — distilled, 4-bit, shipped on the Pixel 8 Pro.

Overview

Gemini 1.0 Nano is Google's smallest and most efficient Gemini 1.0 model, built to run entirely on-device rather than in the cloud. It launched on December 6, 2023 as part of the original Gemini 1.0 family (Ultra, Pro, Nano), and the Pixel 8 Pro was the first phone engineered to run it — using the TPU on Google's Tensor G3 chip via a new Android system service called AICore. Because inference happens locally, Gemini Nano can work offline and keeps input data on the device.

Per Google's Gemini 1.0 technical report, Nano ships in two sizes: Nano-1 at 1.8 billion parameters (for lower-memory devices) and Nano-2 at 3.25 billion parameters (for higher-memory devices). Both are distilled from the larger Gemini models and 4-bit quantized for deployment. The report describes Nano as excelling at on-device tasks such as summarization, reading comprehension, and text completion, with notably strong factuality (retrieval-related) performance for its size.
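To make the effect of 4-bit quantization concrete, here is a rough back-of-the-envelope sketch of weight storage for the two sizes. It assumes every parameter is stored at exactly the stated bit width and ignores quantization scales, any layers kept at higher precision, activations, and KV cache, so real on-device footprints will differ. The function names are illustrative and do not come from Google's report.

```python
# Rough weight-storage estimate for a quantized model.
# Assumption: each parameter takes exactly `bits` bits; overhead such as
# quantization scales, activations, and KV cache is NOT included.

NANO_PARAMS = {
    "Nano-1": 1.8e9,   # parameter count from the Gemini 1.0 technical report
    "Nano-2": 3.25e9,  # parameter count from the Gemini 1.0 technical report
}


def weight_bytes(params: float, bits: int) -> float:
    """Bytes needed to store `params` weights at `bits` bits each."""
    return params * bits / 8


def footprint_gb(params: float, bits: int) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return weight_bytes(params, bits) / 1e9


if __name__ == "__main__":
    for name, params in NANO_PARAMS.items():
        print(
            f"{name}: {footprint_gb(params, 4):.3f} GB at 4-bit vs "
            f"{footprint_gb(params, 16):.3f} GB at 16-bit"
        )
```

Under these assumptions the weights alone come to roughly 0.9 GB for Nano-1 and about 1.6 GB for Nano-2 at 4 bits, a quarter of what the same parameter counts would need at 16 bits.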

At launch on the Pixel 8 Pro, Gemini Nano powered Summarize in Recorder (bullet-point summaries of recordings, offline) and Smart Reply in Gboard (starting as a developer preview, first in WhatsApp). Gemini 1.0 Nano was the foundation of Google's on-device Gemini line, which Google has continued to update with newer, multimodal on-device versions reachable through ML Kit's GenAI APIs on Android and the Prompt API in Chrome.

Released: 2023-12-06
License: Proprietary
Weights: API only
Parameters: 1.8B (Nano-1) · 3.25B (Nano-2)
Context: Undisclosed
Max output: Undisclosed
Architecture: Distilled transformer, 4-bit quantized for on-device inference
Modalities: Text
Status: Generally available

Benchmarks

  1. BoolQ, Nano-2 (3.25B): 79.3%
  2. BoolQ, Nano-1 (1.8B): 71.6%
  3. TydiQA (GoldP), Nano-2: 74.2%
  4. TydiQA (GoldP), Nano-1: 68.9%
  5. MMLU (5-shot), Nano-2: 55.8%
  6. MMLU (5-shot), Nano-1: 45.9%
  7. BIG-Bench-Hard (3-shot), Nano-2: 42.4%
  8. BIG-Bench-Hard (3-shot), Nano-1: 34.8%
  9. MBPP (coding), Nano-2: 27.2%
  10. MBPP (coding), Nano-1: 20.0%
  11. MATH (4-shot), Nano-2: 22.8%
  12. MATH (4-shot), Nano-1: 13.5%

Scores are percentages on a 0–100 scale; higher is better. All values come from Table 3 of the Gemini 1.0 technical report.
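As a quick way to read the scores listed above, the following sketch computes how far Nano-2 sits ahead of Nano-1 on each benchmark, both in absolute points and relative to the Nano-1 score. The numbers are copied from the list; the dictionary layout and the `gaps` helper are illustrative only.

```python
# Scores in percent, copied from the benchmark list: (Nano-2, Nano-1).
SCORES = {
    "BoolQ": (79.3, 71.6),
    "TydiQA (GoldP)": (74.2, 68.9),
    "MMLU (5-shot)": (55.8, 45.9),
    "BIG-Bench-Hard (3-shot)": (42.4, 34.8),
    "MBPP": (27.2, 20.0),
    "MATH (4-shot)": (22.8, 13.5),
}


def gaps(scores):
    """Map each benchmark to (absolute gap in points, relative gain in %)."""
    result = {}
    for name, (nano2, nano1) in scores.items():
        absolute = nano2 - nano1
        relative = 100 * absolute / nano1
        result[name] = (round(absolute, 1), round(relative, 1))
    return result


if __name__ == "__main__":
    for name, (absolute, relative) in gaps(SCORES).items():
        print(f"{name}: +{absolute} points ({relative}% over Nano-1)")
```

The absolute gap stays within a fairly narrow band, from 5.3 points on TydiQA to 9.9 on MMLU, while the relative gain is largest on MATH, where Nano-1 starts from the lowest score.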

Strengths

  • Runs fully on-device — works offline and keeps input data on the phone (no cloud round-trip)
  • Strong factuality / retrieval performance for its size (per the Gemini 1.0 technical report)
  • Two sizes (1.8B Nano-1, 3.25B Nano-2) to fit low- and high-memory devices
  • 4-bit quantized and distilled from larger Gemini models for efficient mobile inference
  • No per-token inference cost for developers — compute runs on the user's device

Best for

  • On-device summarization (e.g. Summarize in Recorder on Pixel)
  • Smart reply / suggested responses in messaging keyboards
  • Offline text completion and rewriting in mobile apps
  • Privacy-sensitive features where data must stay on the device
  • Reading comprehension and short-form generation on edge hardware

FAQ

What is Gemini 1.0 Nano?

Gemini 1.0 Nano is the smallest, most efficient model in Google's original Gemini 1.0 family, designed to run on-device rather than in the cloud. It launched on December 6, 2023 and first shipped on the Pixel 8 Pro, running on the Tensor G3 chip through Android's AICore system service so features can work offline and keep data on the phone.

How big is Gemini Nano and how is it deployed?

Google's Gemini 1.0 technical report describes two sizes: Nano-1 at 1.8 billion parameters (for lower-memory devices) and Nano-2 at 3.25 billion parameters (for higher-memory devices). Both are distilled from the larger Gemini models and 4-bit quantized for on-device deployment.

How did Gemini Nano score on benchmarks?

Per Table 3 of the Gemini 1.0 technical report, Nano-2 / Nano-1 scored 79.3 / 71.6 on BoolQ, 74.2 / 68.9 on TydiQA (GoldP), 55.8 / 45.9 on MMLU (5-shot), 42.4 / 34.8 on BIG-Bench-Hard (3-shot), 27.2 / 20.0 on MBPP, and 22.8 / 13.5 on MATH (4-shot). Factuality (retrieval) tasks are its relative strength for the size.

How can developers use Gemini Nano, and what does it cost?

On Android, developers reach Gemini Nano through ML Kit's GenAI APIs, built on the AICore system service; in Chrome, it backs the built-in Prompt API (exposed as window.ai in early previews). Because inference runs on the user's device, there is no per-token API cost; the trade-off is device hardware requirements and app/model storage.