AI/TLDR

GPT-4o mini

OpenAI's cheap, fast small model that replaced GPT-3.5 Turbo as the everyday workhorse.

Overview

GPT-4o mini is OpenAI's small, low-cost model introduced on July 18, 2024 as the successor to GPT-3.5 Turbo. It launched at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo — and immediately became the default model for free-tier ChatGPT users while also being available in the API.

Despite the low price, GPT-4o mini posted strong scores for its tier: OpenAI reported 82.0% on MMLU, 87.0% on the MGSM math benchmark, and 87.2% on HumanEval coding, beating rival small models of the era such as Google's Gemini 1.5 Flash and Anthropic's Claude 3 Haiku on those tests. It carries a 128K-token context window, returns up to 16,384 output tokens per request, and has an October 2023 knowledge cutoff.

At launch GPT-4o mini supported text and vision (image) inputs with text output; OpenAI said video and audio support could come later via the API. For two years it was OpenAI's go-to model for high-volume, latency-sensitive, and cost-sensitive tasks like classification, extraction, summarization, and chat assistants. It remains available through the OpenAI API even after the older 4o-class models were retired from the ChatGPT interface during OpenAI's 2026 legacy-model cleanup.

Released2024-07-18
LicenseProprietary (OpenAI commercial terms; available via API and ChatGPT)
WeightsAPI only
ParametersNot disclosed by OpenAI
Context128K tokens
Max output16,384 tokens
ArchitectureProprietary multimodal Transformer (decoder-only). OpenAI did not publish parameter counts or architectural details; the model accepts text and image inputs and returns text.
Knowledge cutoffOctober 2023
Modalitiestext input, image input (vision), text output
StatusAvailable via the OpenAI API. Launched July 18, 2024 as the default free-tier ChatGPT model; as part of OpenAI's legacy-model phase-out the older 4o-class models were removed from the ChatGPT model picker in early 2026, but gpt-4o-mini remains a supported API model with no announced sunset date.

Benchmarks

  1. MMLU (5-shot)82%
  2. MGSM (math, 0-shot CoT)87%
  3. HumanEval (coding)87.2%
  4. MMMU (multimodal reasoning)59.4%
  5. MATH70.2%
  6. GPQA40.2%
  7. DROP (F1)79.7%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.15 per 1M tokens per 1M tokens
Output$0.60 per 1M tokens per 1M tokens

Launch pricing as announced by OpenAI on July 18, 2024; OpenAI stated this was more than 60% cheaper than GPT-3.5 Turbo. Batch API and cached-input discounts apply separately.

Pricing source ↗

Strengths

  • Very low price — $0.15 input / $0.60 output per million tokens, far cheaper than GPT-4o and a steep cut from GPT-3.5 Turbo
  • Fast, low-latency responses suited to high-throughput and real-time workloads
  • Strong reasoning and coding for its size class (MMLU 82%, HumanEval 87.2%) — outscored Gemini 1.5 Flash and Claude 3 Haiku at launch
  • 128K-token context window — long enough to process large documents and long conversations
  • Native vision: accepts image inputs alongside text
  • Supports function/tool calling, structured outputs, and fine-tuning

Best for

  • High-volume text classification, tagging, and content moderation where cost per call matters
  • Information extraction and structured-data parsing from documents and emails
  • Summarization of long documents, transcripts, and chat logs within the 128K window
  • Customer-facing chatbots and assistants needing fast, cheap responses
  • Drafting, rewriting, and routine coding help where frontier-model quality isn't required
  • Function/tool-calling agents and pipelines that fan out many small LLM calls

How to access

ProviderModel ID
OpenAI ↗gpt-4o-mini
OpenAI (dated snapshot) ↗gpt-4o-mini-2024-07-18

GPT Mini — every version

The full lineage of the GPT Mini line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
GPT-5.4 minicurrent2026-03-17Proprietary
GPT-5 mini2025-08-07Proprietary
GPT-4o mini2024-07-18Proprietary

FAQ

How much does GPT-4o mini cost?

At launch (July 18, 2024) OpenAI priced GPT-4o mini at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo. Discounts apply for cached inputs and the Batch API.

What is GPT-4o mini's context window?

GPT-4o mini has a 128K-token context window and can return up to 16,384 tokens of output per request. Its training knowledge runs through October 2023.

Is GPT-4o mini multimodal?

Yes, for input: it accepts both text and images (vision) and produces text output. At launch OpenAI said audio and video support could be added later via the API.

Is GPT-4o mini still available?

Yes. It remains a supported model in the OpenAI API with no announced sunset date. The older 4o-class models were removed from the ChatGPT model picker during OpenAI's 2026 legacy-model phase-out, but gpt-4o-mini continues to be offered through the API.