Overview
Gemini 1.5 Pro was Google DeepMind's mid-size multimodal model, announced on February 15, 2024, just two months after the first Gemini 1.0 launch. Its headline feature was an enormous context window: it shipped to early testers with 1 million tokens and later reached 2 million tokens at general availability, far beyond the 128K and 200K windows that rival models offered at the time. By Google's figures, 1 million tokens let a single prompt hold roughly an hour of video, around 11 hours of audio, codebases over 30,000 lines, or 700,000-plus words.
Built on a sparse Mixture-of-Experts (MoE) Transformer, Gemini 1.5 Pro reached quality comparable to the much larger Gemini 1.0 Ultra while using significantly less compute. It was natively multimodal, accepting interleaved text, images, audio, and video in the same request, and it scored 99% on the long-context "needle in a haystack" retrieval test at the full 1-million-token length, a result Google highlighted as evidence the long context was genuinely usable rather than nominal.
Gemini 1.5 Pro is now retired. The 001 version was discontinued on May 27, 2025, and the 002 version (plus the gemini-1.5-pro alias) was shut down on September 24, 2025, after which API calls return errors. Google has steered developers to the newer Gemini 2.0 and 2.5 models. The entry below documents its specs, benchmarks, and historical pricing for reference.
| Released | 2024-02-15 |
|---|---|
| License | Proprietary (Google). Available only as a hosted API; not redistributable. |
| Weights | API only |
| Parameters | Not publicly disclosed |
| Context | Up to 2,097,152 tokens (2M) at general availability; launched at 1M tokens in Feb 2024, with a 128K standard tier at first |
| Max output | 8,192 tokens |
| Architecture | Sparse Mixture-of-Experts (MoE) Transformer. Instead of activating one large dense network for every token, MoE routes each token through a subset of smaller "expert" sub-networks, which let Gemini 1.5 Pro reach 1.0 Ultra-level quality using less compute. |
| Knowledge cutoff | November 2023 |
| Modalities | text, image, audio, video, code |
| Status | Retired. Gemini 1.5 Pro 001 was discontinued on May 27, 2025; Gemini 1.5 Pro 002 and the gemini-1.5-pro alias were shut down on September 24, 2025. Google directs users to the Gemini 2.0 / 2.5 family. |
Benchmarks
- MMLU (5-shot): 85.9%
- MATH: 82.9%
- HumanEval: 89%
- MBPP: 87.8%
- MMLU-Pro: 76.1%
- GPQA Diamond: 53.5%
Scores are percentages on a 0–100 scale; higher is better.
Pricing
| Input | $1.25 / 1M tokens (prompts ≤128K); $2.50 / 1M tokens (prompts >128K) |
|---|---|
| Output | $5.00 / 1M tokens (prompts ≤128K); $10.00 / 1M tokens (prompts >128K) |
Historical paid-tier pricing in effect after the October 1, 2024 price reduction (which cut input ~64% and output ~52%) until the model's retirement. The model is no longer purchasable.
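As a rough illustration of how the two tiers combine, here is a minimal Python sketch that estimates the historical cost of one request from the rates listed above. The function name `estimate_cost_usd` is hypothetical, and treating "128K" as exactly 128,000 tokens is an assumption; the precise billing threshold is not stated in this entry.

```python
# Historical Gemini 1.5 Pro paid-tier rates (USD per 1M tokens), taken from the table above.
INPUT_RATE_SHORT = 1.25
INPUT_RATE_LONG = 2.50
OUTPUT_RATE_SHORT = 5.00
OUTPUT_RATE_LONG = 10.00

# Assumption: "128K" is interpreted as 128,000 tokens.
LONG_PROMPT_THRESHOLD = 128_000


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under the post-October-2024 rates."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # The prompt length selects the tier for both input and output rates.
    long_prompt = prompt_tokens > LONG_PROMPT_THRESHOLD
    input_rate = INPUT_RATE_LONG if long_prompt else INPUT_RATE_SHORT
    output_rate = OUTPUT_RATE_LONG if long_prompt else OUTPUT_RATE_SHORT
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_cost_usd(100_000, 2_000):.3f}")
    print(f"${estimate_cost_usd(1_000_000, 8_192):.3f}")
```

Under these assumptions, a 100,000-token prompt with 2,000 output tokens comes to about $0.135, while a 1-million-token prompt with a full 8,192-token response comes to about $2.58, almost all of it input cost.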
Strengths
- Very long context: up to 2 million tokens at GA, enough to load entire books, codebases, or long videos into a single prompt
- Strong long-context retrieval: 99% accuracy on needle-in-a-haystack at the full 1M-token length
- Native multimodality across text, images, audio, and video in one request
- Efficient MoE design delivered near-1.0-Ultra quality at lower cost
- Context caching support to cut cost on repeated large prompts
Best for
- Analyzing very long documents, contracts, or research-paper bundles in one pass
- Whole-repository code understanding and Q&A over large codebases
- Summarizing or querying long-form video and audio (lectures, meetings, podcasts)
- Retrieval-augmented workflows where a huge context replaces or supplements a vector database
- Multimodal extraction and reasoning over mixed text/image/audio/video inputs
How to access
| Provider | Model ID |
|---|---|
| Google AI Studio / Gemini API | gemini-1.5-pro (retired) |
| Google Cloud Vertex AI | gemini-1.5-pro-002 (retired) |
Gemini Pro — every version
The full lineage of the Gemini Pro line, newest first.
| Version | Released | Context | License |
|---|---|---|---|
| Gemini 3.5 Pro | 2026-05-19 | — | Proprietary |
| Gemini 3.1 Pro (current) | 2026-02-19 | 1M | Proprietary |
| Gemini 3 Pro | 2025-11-18 | — | Proprietary |
| Gemini 2.5 Pro | 2025-03-25 | — | Proprietary |
| Gemini 2.0 Pro | 2025-02-05 | — | Proprietary |
| Gemini 1.5 Pro | 2024-02-15 | 2M | Proprietary |
| Gemini 1.0 Ultra | 2024-02-08 | — | Proprietary |
| Gemini 1.0 Pro | 2023-12-13 | — | Proprietary |
FAQ
Is Gemini 1.5 Pro still available?
No. Gemini 1.5 Pro is retired. The 001 version was discontinued on May 27, 2025, and the 002 version (plus the gemini-1.5-pro alias) was shut down on September 24, 2025, after which API calls return errors. Google recommends migrating to the Gemini 2.0 or 2.5 family.
How big was the Gemini 1.5 Pro context window?
It launched in February 2024 with a 1-million-token window for early testers (and a 128K standard tier), then reached 2 million tokens (2,097,152) at general availability. By Google's figures, 1 million tokens was enough to hold roughly an hour of video, about 11 hours of audio, or 700,000-plus words in a single prompt.
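For a back-of-the-envelope sense of scale, the sketch below estimates whether a text of a given word count would have fit in the window. It assumes Google's rough figure of about 700,000 words per 1 million tokens, which is an approximation for English prose and not a tokenizer measurement; the function names are hypothetical.

```python
# Assumption: roughly 700,000 words per 1,000,000 tokens (Google's approximate figure).
WORDS_PER_MILLION_TOKENS = 700_000

# Maximum input window at general availability, per the spec table.
MAX_CONTEXT_TOKENS = 2_097_152


def estimated_tokens(word_count: int) -> int:
    """Rough token estimate for a word count, rounded up (integer arithmetic only)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Ceiling division without floating point: ceil(words * 1e6 / 700_000).
    return -(-word_count * 1_000_000 // WORDS_PER_MILLION_TOKENS)


def fits_in_window(word_count: int, window_tokens: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if the estimated token count is within the given window."""
    return estimated_tokens(word_count) <= window_tokens


if __name__ == "__main__":
    for words in (700_000, 1_400_000, 1_500_000):
        print(words, estimated_tokens(words), fits_in_window(words))
```

Real token counts depend on the tokenizer, the language, and the mix of modalities, so an estimate like this is only useful for a first sanity check.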
What did Gemini 1.5 Pro cost?
After the October 2024 price cut, the paid tier was $1.25 per million input tokens and $5.00 per million output tokens for prompts up to 128K tokens, doubling to $2.50 / $10.00 per million for prompts over 128K. Those prices applied until the model was retired.
What architecture did Gemini 1.5 Pro use?
It used a sparse Mixture-of-Experts (MoE) Transformer. Rather than running one large dense network for every token, MoE routes each token through a subset of smaller expert sub-networks, which let Gemini 1.5 Pro match Gemini 1.0 Ultra-level quality using less compute.
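To make the routing idea concrete, here is a toy Python sketch of a generic top-k MoE layer. Google has not published Gemini 1.5 Pro's routing details, so everything here (the `ToyMoELayer` class, the expert count, `top_k`, the random weights, and the softmax-over-selected-experts gating) is an illustrative assumption and not the model's actual design.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMoELayer:
    """Generic top-k routed MoE layer with random weights (illustration only)."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a small dim x dim linear map.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Pick the top_k highest-scoring experts and their mixing weights."""
        scores = [dot(w, token) for w in self.router]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def forward(self, token):
        """Run only the chosen experts and blend their outputs."""
        chosen, weights = self.route(token)
        out = [0.0] * len(token)
        for idx, weight in zip(chosen, weights):
            expert_out = [dot(row, token) for row in self.experts[idx]]
            out = [o + weight * e for o, e in zip(out, expert_out)]
        return out, chosen


if __name__ == "__main__":
    layer = ToyMoELayer(dim=4, num_experts=8, top_k=2)
    output, used = layer.forward([1.0, 0.5, -0.5, 2.0])
    print("experts used:", used)
    print("output:", output)
```

In this toy setup only 2 of the 8 expert matrices are multiplied per token, which is the source of the compute savings relative to a dense layer that holds the same total number of parameters.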