Overview
GPT-4o mini is OpenAI's small, low-cost model introduced on July 18, 2024 as the successor to GPT-3.5 Turbo. It launched at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo — and immediately became the default model for free-tier ChatGPT users while also being available in the API.
Despite the low price, GPT-4o mini posted strong scores for its tier: OpenAI reported 82.0% on MMLU, 87.0% on the MGSM math benchmark, and 87.2% on HumanEval coding, beating rival small models of the era such as Google's Gemini 1.5 Flash and Anthropic's Claude 3 Haiku on those tests. It carries a 128K-token context window, returns up to 16,384 output tokens per request, and has an October 2023 knowledge cutoff.
At launch GPT-4o mini supported text and vision (image) inputs with text output; OpenAI said video and audio support could come later via the API. For two years it was OpenAI's go-to model for high-volume, latency-sensitive, and cost-sensitive tasks like classification, extraction, summarization, and chat assistants. It remains available through the OpenAI API even after the older 4o-class models were retired from the ChatGPT interface during OpenAI's 2026 legacy-model cleanup.
| Released | 2024-07-18 |
|---|---|
| License | Proprietary (OpenAI commercial terms; available via API and ChatGPT) |
| Weights | API only |
| Parameters | Not disclosed by OpenAI |
| Context | 128K tokens |
| Max output | 16,384 tokens |
| Architecture | Proprietary multimodal Transformer (decoder-only). OpenAI did not publish parameter counts or architectural details; the model accepts text and image inputs and returns text. |
| Knowledge cutoff | October 2023 |
| Modalities | text input, image input (vision), text output |
| Status | Available via the OpenAI API. Launched July 18, 2024 as the default free-tier ChatGPT model; as part of OpenAI's legacy-model phase-out the older 4o-class models were removed from the ChatGPT model picker in early 2026, but gpt-4o-mini remains a supported API model with no announced sunset date. |
Benchmarks
- MMLU (5-shot)82%
- MGSM (math, 0-shot CoT)87%
- HumanEval (coding)87.2%
- MMMU (multimodal reasoning)59.4%
- MATH70.2%
- GPQA40.2%
- DROP (F1)79.7%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.15 per 1M tokens per 1M tokens |
|---|---|
| Output | $0.60 per 1M tokens per 1M tokens |
Launch pricing as announced by OpenAI on July 18, 2024; OpenAI stated this was more than 60% cheaper than GPT-3.5 Turbo. Batch API and cached-input discounts apply separately.
Strengths
- Very low price — $0.15 input / $0.60 output per million tokens, far cheaper than GPT-4o and a steep cut from GPT-3.5 Turbo
- Fast, low-latency responses suited to high-throughput and real-time workloads
- Strong reasoning and coding for its size class (MMLU 82%, HumanEval 87.2%) — outscored Gemini 1.5 Flash and Claude 3 Haiku at launch
- 128K-token context window — long enough to process large documents and long conversations
- Native vision: accepts image inputs alongside text
- Supports function/tool calling, structured outputs, and fine-tuning
Best for
- High-volume text classification, tagging, and content moderation where cost per call matters
- Information extraction and structured-data parsing from documents and emails
- Summarization of long documents, transcripts, and chat logs within the 128K window
- Customer-facing chatbots and assistants needing fast, cheap responses
- Drafting, rewriting, and routine coding help where frontier-model quality isn't required
- Function/tool-calling agents and pipelines that fan out many small LLM calls
How to access
| Provider | Model ID |
|---|---|
| OpenAI ↗ | gpt-4o-mini |
| OpenAI (dated snapshot) ↗ | gpt-4o-mini-2024-07-18 |
GPT Mini — every version
The full lineage of the GPT Mini line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| GPT-5.4 minicurrent | 2026-03-17 | — | Proprietary |
| GPT-5 mini | 2025-08-07 | — | Proprietary |
| GPT-4o mini | 2024-07-18 | — | Proprietary |
FAQ
How much does GPT-4o mini cost?
At launch (July 18, 2024) OpenAI priced GPT-4o mini at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo. Discounts apply for cached inputs and the Batch API.
What is GPT-4o mini's context window?
GPT-4o mini has a 128K-token context window and can return up to 16,384 tokens of output per request. Its training knowledge runs through October 2023.
Is GPT-4o mini multimodal?
Yes, for input: it accepts both text and images (vision) and produces text output. At launch OpenAI said audio and video support could be added later via the API.
Is GPT-4o mini still available?
Yes. It remains a supported model in the OpenAI API with no announced sunset date. The older 4o-class models were removed from the ChatGPT model picker during OpenAI's 2026 legacy-model phase-out, but gpt-4o-mini continues to be offered through the API.