Overview
GPT-5 mini is the mid-tier model in OpenAI's GPT-5 family, launched on August 7, 2025 alongside GPT-5 and GPT-5 nano. OpenAI positions GPT-5 mini as a faster, more cost-efficient version of GPT-5 built for well-defined tasks and precise prompts, trading some of the flagship's depth for lower latency and a far cheaper price.
Like the rest of the GPT-5 line, GPT-5 mini is a reasoning model: it can spend extra tokens 'thinking' before it answers, and OpenAI's safety documentation refers to this variant as gpt-5-thinking-mini. It accepts text and image input and returns text, with a 400,000-token context window and up to 128,000 output tokens, making it suitable for long documents and code while keeping inference inexpensive.
GPT-5 mini is proprietary and closed-weight, available only through OpenAI's API (model id gpt-5-mini) and resellers such as OpenRouter. It is priced at $0.25 per million input tokens and $2.00 per million output tokens, with cached input at $0.025 per million. Its knowledge cutoff is May 31, 2024.
| Released | 2025-08-07 |
|---|---|
| License | Proprietary |
| Weights | API only |
| Context | 400K |
| Max output | 128K |
| Architecture | Proprietary reasoning model; OpenAI does not disclose parameter count or architecture details. |
| Knowledge cutoff | May 31, 2024 |
| Modalities | Text, Vision |
| Status | Available |
Benchmarks
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.25 / 1M tokens per 1M tokens |
|---|---|
| Cached input | $0.025 / 1M tokens per 1M tokens |
| Output | $2.00 / 1M tokens per 1M tokens |
Strengths
- Low price for a reasoning-capable model: $0.25 input / $2.00 output per million tokens, with cached input at $0.025
- Large 400K-token context window for long documents, transcripts, and codebases
- Up to 128K output tokens for long-form generation and detailed reasoning traces
- Adjustable reasoning effort, so you can dial cost and latency up or down per request
- Text plus image (vision) input through a single API model
- Available via OpenAI's Chat Completions and Responses APIs, plus OpenRouter
Best for
- High-volume, well-defined tasks where a precise prompt matters more than maximum model depth
- Cost-sensitive classification, extraction, and summarization over long inputs
- Customer-support and assistant backends that need cheap reasoning at scale
- Document and codebase question-answering that benefits from the 400K context window
- Vision tasks such as reading screenshots, charts, or scanned pages from text-plus-image prompts
- A cheaper fallback or router tier behind a larger GPT-5 model for easy requests
How to access
| Provider | Model ID |
|---|---|
| OpenAI ↗ | gpt-5-mini |
| OpenRouter ↗ | openai/gpt-5-mini |
GPT Mini — every version
The full lineage of the GPT Mini line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| GPT-5.4 minicurrent | 2026-03-17 | — | Proprietary |
| GPT-5 mini | 2025-08-07 | — | Proprietary |
| GPT-4o mini | 2024-07-18 | — | Proprietary |
FAQ
How much does GPT-5 mini cost?
GPT-5 mini costs $0.25 per million input tokens and $2.00 per million output tokens through OpenAI's API, with cached input billed at $0.025 per million tokens.
What is GPT-5 mini's context window?
GPT-5 mini has a 400,000-token context window and can return up to 128,000 output tokens per request.
Is GPT-5 mini open source or downloadable?
No. GPT-5 mini is proprietary and closed-weight. It is only available through OpenAI's API (model id gpt-5-mini) and resellers such as OpenRouter; the weights cannot be downloaded or self-hosted.
What can GPT-5 mini take as input?
GPT-5 mini accepts text and image (vision) input and returns text. It is a reasoning model with adjustable reasoning effort, and its knowledge cutoff is May 31, 2024.