GPT-4o mini

Name: GPT-4o mini
Author: OpenAI

OpenAI's cheap, fast small model that replaced GPT-3.5 Turbo as the everyday workhorse.

Overview

GPT-4o mini is OpenAI's small, low-cost model introduced on July 18, 2024 as the successor to GPT-3.5 Turbo. It launched at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo — and immediately became the default model for free-tier ChatGPT users while also being available in the API.

Despite the low price, GPT-4o mini posted strong scores for its tier: OpenAI reported 82.0% on MMLU, 87.0% on the MGSM math benchmark, and 87.2% on HumanEval coding, beating rival small models of the era such as Google's Gemini 1.5 Flash and Anthropic's Claude 3 Haiku on those tests. It carries a 128K-token context window, returns up to 16,384 output tokens per request, and has an October 2023 knowledge cutoff.

At launch GPT-4o mini supported text and vision (image) inputs with text output; OpenAI said video and audio support could come later via the API. For two years it was OpenAI's go-to model for high-volume, latency-sensitive, and cost-sensitive tasks like classification, extraction, summarization, and chat assistants. It remains available through the OpenAI API even after the older 4o-class models were retired from the ChatGPT interface during OpenAI's 2026 legacy-model cleanup.

Released	2024-07-18
License	Proprietary (OpenAI commercial terms; available via API and ChatGPT)
Weights	API only
Parameters	Not disclosed by OpenAI
Context	128K tokens
Max output	16,384 tokens
Architecture	Proprietary multimodal Transformer (decoder-only). OpenAI did not publish parameter counts or architectural details; the model accepts text and image inputs and returns text.
Knowledge cutoff	October 2023
Modalities	text input, image input (vision), text output
Status	Available via the OpenAI API. Launched July 18, 2024 as the default free-tier ChatGPT model; as part of OpenAI's legacy-model phase-out the older 4o-class models were removed from the ChatGPT model picker in early 2026, but gpt-4o-mini remains a supported API model with no announced sunset date.

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.15 per 1M tokens per 1M tokens
Output	$0.60 per 1M tokens per 1M tokens

Launch pricing as announced by OpenAI on July 18, 2024; OpenAI stated this was more than 60% cheaper than GPT-3.5 Turbo. Batch API and cached-input discounts apply separately.

Pricing source ↗

Strengths

Very low price — $0.15 input / $0.60 output per million tokens, far cheaper than GPT-4o and a steep cut from GPT-3.5 Turbo
Fast, low-latency responses suited to high-throughput and real-time workloads
Strong reasoning and coding for its size class (MMLU 82%, HumanEval 87.2%) — outscored Gemini 1.5 Flash and Claude 3 Haiku at launch
128K-token context window — long enough to process large documents and long conversations
Native vision: accepts image inputs alongside text
Supports function/tool calling, structured outputs, and fine-tuning

Best for

High-volume text classification, tagging, and content moderation where cost per call matters
Information extraction and structured-data parsing from documents and emails
Summarization of long documents, transcripts, and chat logs within the 128K window
Customer-facing chatbots and assistants needing fast, cheap responses
Drafting, rewriting, and routine coding help where frontier-model quality isn't required
Function/tool-calling agents and pipelines that fan out many small LLM calls

How to access

Provider	Model ID
OpenAI ↗	`gpt-4o-mini`
OpenAI (dated snapshot) ↗	`gpt-4o-mini-2024-07-18`

GPT Mini — every version

The full lineage of the GPT Mini line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
GPT-5.4 minicurrent	2026-03-17	—	Proprietary
GPT-5 mini	2025-08-07	—	Proprietary
GPT-4o mini	2024-07-18	—	Proprietary

FAQ

How much does GPT-4o mini cost?

At launch (July 18, 2024) OpenAI priced GPT-4o mini at $0.15 per million input tokens and $0.60 per million output tokens — which OpenAI said made it more than 60% cheaper than GPT-3.5 Turbo. Discounts apply for cached inputs and the Batch API.

What is GPT-4o mini's context window?

GPT-4o mini has a 128K-token context window and can return up to 16,384 tokens of output per request. Its training knowledge runs through October 2023.

Is GPT-4o mini multimodal?

Yes, for input: it accepts both text and images (vision) and produces text output. At launch OpenAI said audio and video support could be added later via the API.

Is GPT-4o mini still available?

Yes. It remains a supported model in the OpenAI API with no announced sunset date. The older 4o-class models were removed from the ChatGPT model picker during OpenAI's 2026 legacy-model phase-out, but gpt-4o-mini continues to be offered through the API.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// GPT Mini — every version

// FAQ