AI/TLDR

GPT-5 mini

OpenAI's fast, low-cost GPT-5 with a 400K context window and reasoning

Overview

GPT-5 mini is the mid-tier model in OpenAI's GPT-5 family, launched on August 7, 2025 alongside GPT-5 and GPT-5 nano. OpenAI positions GPT-5 mini as a faster, more cost-efficient version of GPT-5 built for well-defined tasks and precise prompts, trading some of the flagship's depth for lower latency and a much lower price.

Like the rest of the GPT-5 line, GPT-5 mini is a reasoning model: it can spend extra tokens 'thinking' before it answers, and OpenAI's safety documentation refers to this variant as gpt-5-thinking-mini. It accepts text and image input and returns text, with a 400,000-token context window and up to 128,000 output tokens, making it suitable for long documents and code while keeping inference inexpensive.

GPT-5 mini is proprietary and closed-weight, available only through OpenAI's API (model id gpt-5-mini) and resellers such as OpenRouter. It is priced at $0.25 per million input tokens and $2.00 per million output tokens, with cached input at $0.025 per million. Its knowledge cutoff is May 31, 2024.

Released: 2025-08-07
License: Proprietary
Weights: API only
Context: 400K
Max output: 128K
Architecture: Proprietary reasoning model; OpenAI does not disclose parameter count or architecture details.
Knowledge cutoff: May 31, 2024
Modalities: Text, Vision
Status: Available

Benchmarks

  1. HealthBench Hard: 40.3%
  2. SimpleQA (accuracy, no web): 22%

Scores are on a 0–100 scale; higher is better.

Pricing

Input: $0.25 per 1M tokens
Cached input: $0.025 per 1M tokens
Output: $2.00 per 1M tokens
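
To make the rates above concrete, here is a minimal cost-estimate sketch in Python. The function name `estimate_cost` and its parameters are hypothetical, chosen for illustration. It assumes that cached tokens are a subset of the input tokens, and that any reasoning tokens are already included in the output token count; check your actual usage report before relying on either assumption.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
INPUT_PER_M = 0.25
CACHED_INPUT_PER_M = 0.025
OUTPUT_PER_M = 2.00


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_input_tokens is the part of input_tokens billed
    at the cached rate, and output_tokens already includes any
    reasoning tokens.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")

    uncached = input_tokens - cached_input_tokens
    total = (uncached * INPUT_PER_M
             + cached_input_tokens * CACHED_INPUT_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / 1_000_000
```

Under these assumptions, a request with 100,000 input tokens and 10,000 output tokens comes to about $0.045. If 80,000 of those input tokens are served from cache, the estimate drops to about $0.027.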


Strengths

  • Low price for a reasoning-capable model: $0.25 input / $2.00 output per million tokens, with cached input at $0.025
  • Large 400K-token context window for long documents, transcripts, and codebases
  • Up to 128K output tokens for long-form generation and detailed reasoning traces
  • Adjustable reasoning effort, so you can dial cost and latency up or down per request
  • Text plus image (vision) input through a single API model
  • Available via OpenAI's Chat Completions and Responses APIs, plus OpenRouter

Best for

  • High-volume, well-defined tasks where a precise prompt matters more than maximum model depth
  • Cost-sensitive classification, extraction, and summarization over long inputs
  • Customer-support and assistant backends that need cheap reasoning at scale
  • Document and codebase question-answering that benefits from the 400K context window
  • Vision tasks such as reading screenshots, charts, or scanned pages from text-plus-image prompts
  • A cheaper fallback or router tier behind a larger GPT-5 model for easy requests

How to access

Provider      Model ID
OpenAI        gpt-5-mini
OpenRouter    openai/gpt-5-mini
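
As a rough illustration of calling the model directly, the sketch below builds a request body for OpenAI's Responses API using only the Python standard library. The helper names `build_request` and `send` are hypothetical. The set of accepted effort values and the default output cap are assumptions made for this example; consult OpenAI's API reference for the authoritative parameter list.

```python
import json
import os
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"

# Assumed set of reasoning-effort values; verify against the API reference.
ALLOWED_EFFORT = {"minimal", "low", "medium", "high"}


def build_request(prompt: str, effort: str = "low",
                  max_output_tokens: int = 1024) -> dict:
    """Build a Responses API request body for gpt-5-mini (hypothetical helper)."""
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return {
        "model": "gpt-5-mini",
        "input": prompt,
        "reasoning": {"effort": effort},
        "max_output_tokens": max_output_tokens,
    }


def send(payload: dict) -> dict:
    """POST the payload; expects OPENAI_API_KEY in the environment."""
    request = urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request("Summarize this ticket.", effort="low"))` would issue a billed request, so the sketch keeps building and sending separate. Lower effort settings should reduce both latency and output-token spend, at some cost in answer depth.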

GPT Mini — every version

The full lineage of the GPT Mini line, newest first.

Version                   Released      Context   License
GPT-5.4 mini (current)    2026-03-17    -         Proprietary
GPT-5 mini                2025-08-07    -         Proprietary
GPT-4o mini               2024-07-18    -         Proprietary

FAQ

How much does GPT-5 mini cost?

GPT-5 mini costs $0.25 per million input tokens and $2.00 per million output tokens through OpenAI's API, with cached input billed at $0.025 per million tokens.

What is GPT-5 mini's context window?

GPT-5 mini has a 400,000-token context window and can return up to 128,000 output tokens per request.
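
A quick pre-flight check can catch oversized requests before they are sent. The sketch below is a minimal example, assuming that input tokens and the requested output budget share the same 400,000-token window; the function name `fits_context` is hypothetical, and token counts are expected to come from your own tokenizer.

```python
# Limits as stated on this page.
CONTEXT_WINDOW = 400_000
MAX_OUTPUT_TOKENS = 128_000


def fits_context(input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the request fits the stated limits (hypothetical helper).

    Assumption: input and output tokens are counted against one shared
    context window.
    """
    if input_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return input_tokens + max_output_tokens <= CONTEXT_WINDOW
```

Under this assumption, requesting the full 128,000 output tokens leaves room for at most 272,000 input tokens.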

Is GPT-5 mini open source or downloadable?

No. GPT-5 mini is proprietary and closed-weight. It is only available through OpenAI's API (model id gpt-5-mini) and resellers such as OpenRouter; the weights cannot be downloaded or self-hosted.

What can GPT-5 mini take as input?

GPT-5 mini accepts text and image (vision) input and returns text. It is a reasoning model with adjustable reasoning effort, and its knowledge cutoff is May 31, 2024.