Gemini 1.5 Pro

Name: Gemini 1.5 Pro
Author: Google

Google's long-context multimodal model that popularized the 1-million-token window.

Overview

Gemini 1.5 Pro was Google DeepMind's mid-size multimodal model, announced on February 15, 2024, about two months after the first Gemini 1.0 launch. Its headline feature was an enormous context window: it shipped to early testers with 1 million tokens and later reached 2 million tokens at general availability, far beyond the 128K and 200K windows that rival models offered at the time. At 1 million tokens, a single prompt could hold roughly an hour of video, around 11 hours of audio, codebases over 30,000 lines, or 700,000-plus words.

Built on a sparse Mixture-of-Experts (MoE) Transformer, Gemini 1.5 Pro reached quality comparable to the much larger Gemini 1.0 Ultra while using significantly less compute. It was natively multimodal, accepting interleaved text, images, audio, and video in the same request, and it scored 99% on the long-context "needle in a haystack" retrieval test at the full 1-million-token length, a result Google highlighted as evidence the long context was genuinely usable rather than nominal.

Gemini 1.5 Pro is now retired. The 001 version was discontinued on May 27, 2025, and the 002 version (plus the gemini-1.5-pro alias) was shut down on September 24, 2025, after which API calls return errors. Google has steered developers to the newer Gemini 2.0 and 2.5 models. The entry below documents its specs, benchmarks, and historical pricing for reference.

Released	2024-02-15
License	Proprietary (Google). Available only as a hosted API; not redistributable.
Weights	API only
Parameters	Not publicly disclosed
Context	Up to 2,097,152 tokens (2M) at general availability; launched at 1M tokens in Feb 2024, with a 128K standard tier at first
Max output	8,192 tokens
Architecture	Sparse Mixture-of-Experts (MoE) Transformer. Instead of activating one large dense network for every token, MoE routes each token through a subset of smaller "expert" sub-networks, which let Gemini 1.5 Pro reach 1.0 Ultra-level quality using less compute.
Knowledge cutoff	November 2023
Modalities	text, image, audio, video, code
Status	Retired. Gemini 1.5 Pro 001 was discontinued on May 27, 2025; Gemini 1.5 Pro 002 and the gemini-1.5-pro alias were shut down on September 24, 2025. Google directs users to the Gemini 2.0 / 2.5 family.

Benchmarks

Benchmark chart not reproduced here. Scores were plotted on a 0–100 scale; higher is better.

Pricing

Input	$1.25 / 1M tokens (prompts ≤128K); $2.50 / 1M tokens (prompts >128K)
Output	$5.00 / 1M tokens (prompts ≤128K); $10.00 / 1M tokens (prompts >128K)

Historical paid-tier pricing in effect after the October 1, 2024 price reduction (which cut input ~64% and output ~52%) until the model's retirement. The model is no longer purchasable.
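The two-tier pricing above can be turned into a quick cost estimate. The following is a minimal sketch, not an official billing formula: the function name `estimate_cost_usd` is hypothetical, "128K" is assumed to mean 128,000 tokens, and the prompt length alone is assumed to select the tier for both input and output tokens. Context caching and any free-tier allowance are ignored.

```python
# Historical paid-tier rates from the pricing table, in USD per 1M tokens.
# Assumption: "128K" is read as 128,000 tokens; the exact boundary is not
# stated in this entry.
TIER_BOUNDARY_TOKENS = 128_000
RATES = {
    "short": {"input": 1.25, "output": 5.00},   # prompts <= 128K
    "long": {"input": 2.50, "output": 10.00},   # prompts > 128K
}


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under the two-tier pricing.

    Assumption: the prompt length selects the tier, and that tier's rates
    apply to every input and output token in the request.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = "short" if prompt_tokens <= TIER_BOUNDARY_TOKENS else "long"
    rates = RATES[tier]
    total = prompt_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt with a 2K-token answer stays in the lower tier.
    print(estimate_cost_usd(100_000, 2_000))
    # A 1M-token prompt with a maximum-length answer uses the higher tier.
    print(estimate_cost_usd(1_000_000, 8_192))
```

Under these assumptions a 100K-token prompt with a 2K-token reply comes to roughly $0.135, while a full 1M-token prompt with an 8,192-token reply comes to roughly $2.58.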


Strengths

Very long context: up to 2 million tokens at GA, enough to load entire books, codebases, or long videos into a single prompt
Strong long-context retrieval: 99% accuracy on needle-in-a-haystack at the full 1M-token length
Native multimodality across text, images, audio, and video in one request
Efficient MoE design delivered near-1.0-Ultra quality at lower cost
Context caching support to cut cost on repeated large prompts

Best for

Analyzing very long documents, contracts, or research-paper bundles in one pass
Whole-repository code understanding and Q&A over large codebases
Summarizing or querying long-form video and audio (lectures, meetings, podcasts)
Retrieval-augmented workflows where a huge context replaces or supplements a vector database
Multimodal extraction and reasoning over mixed text/image/audio/video inputs

How to access

Provider	Model ID
Google AI Studio / Gemini API	`gemini-1.5-pro` (retired)
Google Cloud Vertex AI	`gemini-1.5-pro-002` (retired)
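For historical reference, a request to the Gemini API for this model ID would have looked roughly like the sketch below. It only builds the URL, headers, and JSON body and sends nothing, since calls to the retired IDs now return errors. The helper name `build_generate_request` is hypothetical, and the endpoint path, the `x-goog-api-key` header, and the `generationConfig.maxOutputTokens` field reflect the public REST interface as understood here; check them against the current API reference before relying on them.

```python
import json

# Assumed base URL of the Gemini API REST interface (v1beta).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(model_id: str, prompt: str, api_key: str):
    """Build (url, headers, body) for a text-only generateContent call.

    Nothing is sent over the network; the caller decides how to POST it.
    """
    url = f"{API_ROOT}/models/{model_id}:generateContent"
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": api_key,
    }
    payload = {
        "contents": [{"parts": [{"text": prompt}]}],
        # 8,192 is the maximum output length listed for Gemini 1.5 Pro.
        "generationConfig": {"maxOutputTokens": 8192},
    }
    return url, headers, json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    url, headers, body = build_generate_request(
        "gemini-1.5-pro", "Summarize this contract.", "YOUR_API_KEY"
    )
    print(url)
    print(body.decode("utf-8"))
```

Swapping `model_id` for a currently supported model is the only change needed to reuse the same request shape.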

Gemini Pro — every version

The full lineage of the Gemini Pro line, newest first.

Version	Released	Context	License
Gemini 3.5 Pro	2026-05-19	—	Proprietary
Gemini 3.1 Pro (current)	2026-02-19	1M	Proprietary
Gemini 3 Pro	2025-11-18	—	Proprietary
Gemini 2.5 Pro	2025-03-25	—	Proprietary
Gemini 2.0 Pro	2025-02-05	—	Proprietary
Gemini 1.5 Pro	2024-02-15	2M	Proprietary
Gemini 1.0 Ultra	2024-02-08	—	Proprietary
Gemini 1.0 Pro	2023-12-13	—	Proprietary

FAQ

Is Gemini 1.5 Pro still available?

No. Gemini 1.5 Pro is retired. The 001 version was discontinued on May 27, 2025, and the 002 version (plus the gemini-1.5-pro alias) was shut down on September 24, 2025, after which API calls error out. Google recommends migrating to the Gemini 2.0 or 2.5 family.

How big was the Gemini 1.5 Pro context window?

It launched in February 2024 with a 1-million-token window for early testers (and a 128K standard tier), then reached 2 million tokens (2,097,152) at general availability. The 1-million-token window was already large enough to hold roughly an hour of video, about 11 hours of audio, or 700,000-plus words in a single prompt.
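As a rough illustration of what these sizes mean in practice, the sketch below estimates whether a text of a given word count would have fit in the window. It assumes the "700,000-plus words" figure corresponds to 1 million tokens, which gives a rule of thumb of about 0.7 words per token; real token counts depend on the tokenizer and the language, so this is only a ballpark. The function names are hypothetical.

```python
CONTEXT_WINDOW_GA = 2_097_152      # tokens at general availability
CONTEXT_WINDOW_LAUNCH = 1_000_000  # nominal 1M-token window for early testers

# Assumption: about 700,000 words fit in 1,000,000 tokens.
WORDS_PER_MILLION_TOKENS = 700_000


def estimate_tokens(word_count: int) -> int:
    """Estimate tokens for a word count, rounding up (integer arithmetic)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    numerator = word_count * 1_000_000
    return (numerator + WORDS_PER_MILLION_TOKENS - 1) // WORDS_PER_MILLION_TOKENS


def fits_in_window(word_count: int, window: int = CONTEXT_WINDOW_GA) -> bool:
    """Return True if the estimated token count fits in the given window."""
    return estimate_tokens(word_count) <= window


if __name__ == "__main__":
    for words in (700_000, 1_400_000, 1_500_000):
        print(words, estimate_tokens(words), fits_in_window(words))
```

With this heuristic, 1.4 million words land at about 2 million tokens and fit the 2,097,152-token window, while 1.5 million words do not.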

What did Gemini 1.5 Pro cost?

After the October 2024 price cut, the paid tier was $1.25 per million input tokens and $5.00 per million output tokens for prompts up to 128K tokens, doubling to $2.50 / $10.00 per million for prompts over 128K. Those prices applied until the model was retired.

What architecture did Gemini 1.5 Pro use?

It used a sparse Mixture-of-Experts (MoE) Transformer. Rather than running one large dense network for every token, MoE routes each token through a subset of smaller expert sub-networks, which let Gemini 1.5 Pro match Gemini 1.0 Ultra-level quality using less compute.
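The routing idea can be illustrated with a toy top-k gate. This is a generic Mixture-of-Experts sketch, not Gemini 1.5 Pro's actual design: Google has not published the expert count, the value of k, or the router details, so `moe_forward`, the linear router, and `top_k=2` are all illustrative assumptions. The point it shows is that only the selected experts run for a given token.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(token, router_weights, experts, top_k=2):
    """Route one token vector through its top_k experts and mix the outputs.

    router_weights holds one weight row per expert (a hypothetical linear
    router); experts are callables mapping a vector to a same-length vector.
    """
    if not 1 <= top_k <= len(experts):
        raise ValueError("top_k must be between 1 and the number of experts")
    # Router score for each expert: dot product of its weight row and the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Keep only the top_k highest-scoring experts (this is the sparsity).
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize gate weights over the chosen experts only.
    gates = softmax([logits[i] for i in chosen])
    output = [0.0] * len(token)
    for gate, index in zip(gates, chosen):
        expert_output = experts[index](token)  # unchosen experts never run
        output = [o + gate * e for o, e in zip(output, expert_output)]
    return output, chosen


if __name__ == "__main__":
    # Four toy experts that simply scale the input by 1, 2, 3, or 4.
    experts = [lambda v, s=s: [s * x for x in v] for s in (1.0, 2.0, 3.0, 4.0)]
    router_weights = [[0.1, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
    output, chosen = moe_forward([1.0, 0.0], router_weights, experts, top_k=2)
    print("experts used:", chosen)
    print("output:", output)
```

In this toy setup only two of the four experts execute per token, which is where the compute saving over a dense network of the same total size comes from.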
