GPT-4o

Name: GPT-4o
Author: OpenAI

OpenAI's first "omni" model — one network reasoning across text, vision, and audio in real time.

Overview

GPT-4o ("o" for "omni") was OpenAI's flagship model, announced on May 13, 2024. It was the first OpenAI model trained end-to-end as a single network across text, vision, and audio, so all inputs and outputs flow through the same neural network instead of stitching together separate transcription, text, and speech models. That design let GPT-4o respond to audio in as little as 232 milliseconds (about 320ms on average) — close to human conversational latency — and powered ChatGPT's natural, interruptible Advanced Voice Mode.

On text and code, GPT-4o roughly matched the older GPT-4 Turbo while being faster and cheaper, with notably better performance on non-English languages (it supports over 50 languages). On OpenAI's own reproducible simple-evals harness, the gpt-4o-2024-05-13 snapshot scored 87.2% on MMLU, 91.0% on HumanEval coding, 76.6% on the MATH benchmark, and 49.9% on GPQA. It had a 128K-token context window and an October 2023 knowledge cutoff. The launch snapshot capped output at 4,096 tokens; the later gpt-4o-2024-08-06 snapshot raised that to 16,384.
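The context and output limits above interact: a long prompt leaves less room for the reply, and the output cap depends on the snapshot. The following is a minimal sketch of that arithmetic, not an official client. It assumes "128K" means 128,000 tokens and that prompt and output tokens share the same window; the names `CONTEXT_WINDOW`, `MAX_OUTPUT`, and `output_budget` are hypothetical helpers introduced for illustration.

```python
# Assumption: "128K" is treated as 128,000 tokens, shared by prompt and output.
CONTEXT_WINDOW = 128_000

# Output caps per snapshot, as stated in the overview text.
MAX_OUTPUT = {
    "gpt-4o-2024-05-13": 4_096,
    "gpt-4o-2024-08-06": 16_384,
}


def output_budget(snapshot: str, prompt_tokens: int) -> int:
    """Return the largest output-token request that fits for this prompt size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt alone fills (or overflows) the window
    # The reply is limited by whichever is smaller: the snapshot cap or the space left.
    return min(MAX_OUTPUT[snapshot], remaining)
```

For example, under these assumptions a 120,000-token prompt sent to `gpt-4o-2024-08-06` leaves room for at most 8,000 output tokens, below that snapshot's 16,384 cap.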

GPT-4o launched in the API at $5.00 per million input tokens and $15.00 per million output tokens; later in 2024 OpenAI cut that to $2.50 / $10.00, halving the input price and lowering the output price by a third. It anchored ChatGPT, including the free tier, for over a year before being superseded by the o-series reasoning models and GPT-5. GPT-4o was retired from ChatGPT on February 13, 2026, though it remained available in the API after that, with OpenAI pointing users to newer GPT-5.x models.

Released	2024-05-13
License	Proprietary
Weights	API only
Parameters	Not disclosed
Context	128K
Max output	16K (4K on the gpt-4o-2024-05-13 launch snapshot)
Architecture	Proprietary transformer trained end-to-end as a single "omni" network across text, vision, and audio — all inputs and outputs are processed by the same neural network, rather than chaining separate speech-to-text, text, and text-to-speech models. Non-reasoning (no extended chain-of-thought mode). Parameter count not disclosed.
Knowledge cutoff	October 2023
Modalities	Text, Image (vision), Audio
Status	Retired from ChatGPT on February 13, 2026 (with extended Custom GPT access for some enterprise plans into April 2026); remained available in the API after that date. OpenAI said that only about 0.1% of daily users still chose it, with usage having shifted to GPT-5.2.

Benchmarks

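Scores are percentages on a 0–100 scale; higher is better. All figures are for the gpt-4o-2024-05-13 snapshot on OpenAI's simple-evals harness.

Benchmark	Score
MMLU	87.2%
HumanEval	91.0%
MATH	76.6%
GPQA	49.9%
MGSM	89.9%
DROP (F1)	83.7%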

Pricing

Input	$5.00 per million tokens
Output	$15.00 per million tokens

Launch (May 2024) API price; OpenAI later cut it to $2.50 (input) / $10.00 (output) per million tokens.
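To make the per-million-token pricing concrete, here is a small sketch that estimates the cost of a workload at either price point. The prices come from the figures above; the tier labels `"launch"` and `"reduced"` and the function `estimate_cost` are hypothetical names chosen for this example, and actual billing may differ (for instance, this ignores any discounts or other pricing programs).

```python
from decimal import Decimal

# (input price, output price) in USD per million tokens, from the pricing section.
PRICES_PER_MILLION = {
    "launch": (Decimal("5.00"), Decimal("15.00")),   # May 2024
    "reduced": (Decimal("2.50"), Decimal("10.00")),  # later in 2024
}


def estimate_cost(input_tokens: int, output_tokens: int, tier: str = "launch") -> Decimal:
    """Estimate the USD cost of a request or batch at the given price tier."""
    input_price, output_price = PRICES_PER_MILLION[tier]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)
```

With 200,000 input tokens and 50,000 output tokens, this gives $1.75 at the launch price and $1.00 at the reduced price.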

Strengths

Native omni multimodality — text, image and audio handled by one end-to-end model
Real-time voice: audio responses in as little as 232ms (about 320ms on average), close to human latency
Strong, broad coding (91.0% HumanEval) and general knowledge (87.2% MMLU)
Large 128K-token context window
Much faster and cheaper than the GPT-4 Turbo it replaced (50% lower API price at launch)
Strong multilingual performance across 50+ languages

Best for

Real-time voice assistants and conversational agents (Advanced Voice Mode)
General coding, code generation and debugging
Multilingual chat, translation and summarization
Vision tasks: reading charts, diagrams, screenshots and documents
Long-document understanding within the 128K context window
High-volume chat applications where latency and cost matter

How to access

Provider	Model ID
OpenAI	`gpt-4o-2024-05-13`

GPT (Flagship / Thinking) — every version

The full lineage of the GPT (Flagship / Thinking) line, newest first.

Version	Released	Context	License
GPT-5.5 (current)	2026-04-23	1.05M	Proprietary
GPT-5.4	2026-03-05	—	Proprietary
GPT-5.2	2025-12-11	—	Proprietary
GPT-5.1	2025-11-12	—	Proprietary
GPT-5	2025-08-07	—	Proprietary
GPT-4o	2024-05-13	128K	Proprietary

FAQ

What does the "o" in GPT-4o stand for?

"o" stands for "omni." GPT-4o was OpenAI's first model trained end-to-end as a single network across text, vision, and audio, so one neural network handles all inputs and outputs instead of chaining separate speech, text, and image models.

What were GPT-4o's benchmark scores?

On OpenAI's reproducible simple-evals harness, the gpt-4o-2024-05-13 snapshot scored 87.2% on MMLU, 91.0% on HumanEval, 76.6% on MATH, 49.9% on GPQA, 89.9% on MGSM, and 83.7% F1 on DROP.

How much did GPT-4o cost?

At launch in May 2024 it was $5.00 per million input tokens and $15.00 per million output tokens — 50% cheaper than GPT-4 Turbo. OpenAI later cut the price to $2.50 / $10.00 per million tokens.

Is GPT-4o still available?

It was retired from ChatGPT on February 13, 2026 but remained available in the API after that. OpenAI recommends newer GPT-5.x models as successors.
