AI/TLDR

GPT-4o

OpenAI's first "omni" model — one network reasoning across text, vision, and audio in real time.

Overview

GPT-4o ("o" for "omni") was OpenAI's flagship model, announced on May 13, 2024. It was the first OpenAI model trained end-to-end as a single network across text, vision, and audio, so all inputs and outputs flow through the same neural network instead of stitching together separate transcription, text, and speech models. That design let GPT-4o respond to audio in as little as 232 milliseconds (about 320ms on average) — close to human conversational latency — and powered ChatGPT's natural, interruptible Advanced Voice Mode.

On text and code, GPT-4o roughly matched the older GPT-4 Turbo while being faster and cheaper, with notably better performance on non-English languages (it supports over 50 languages). On OpenAI's own reproducible simple-evals harness, the gpt-4o-2024-05-13 snapshot scored 87.2% on MMLU, 91.0% on HumanEval coding, 76.6% on the MATH benchmark, and 49.9% on GPQA. It had a 128K-token context window and an October 2023 knowledge cutoff. The launch snapshot capped output at 4,096 tokens; the later gpt-4o-2024-08-06 snapshot raised that to 16,384.

GPT-4o launched in the API at $5.00 per million input tokens and $15.00 per million output tokens; OpenAI cut that roughly in half (to $2.50 / $10.00) later in 2024. It anchored ChatGPT — including the free tier — for over a year before being superseded by the o-series reasoning models and GPT-5. GPT-4o was retired from ChatGPT on February 13, 2026, though it remained available in the API after that, with OpenAI pointing users to newer GPT-5.x models.

Released2024-05-13
LicenseProprietary
WeightsAPI only
ParametersNot disclosed
Context128K
Max output16K
ArchitectureProprietary transformer trained end-to-end as a single "omni" network across text, vision, and audio — all inputs and outputs are processed by the same neural network, rather than chaining separate speech-to-text, text, and text-to-speech models. Non-reasoning (no extended chain-of-thought mode). Parameter count not disclosed.
Knowledge cutoffOctober 2023
ModalitiesText, Image, Audio, Vision
StatusRetired from ChatGPT on February 13, 2026 (with extended Custom GPT access for some enterprise plans into April 2026); remained available in the API after that date. OpenAI cited that only about 0.1% of daily users still chose it, with usage having shifted to GPT-5.2.

Benchmarks

  1. MMLU87.2%
  2. HumanEval (coding)91%
  3. MATH76.6%
  4. GPQA49.9%
  5. MGSM (multilingual math)89.9%
  6. DROP (F1, 3-shot)83.7%
  7. SimpleQA39%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$5.00 per million tokens
Output$15.00 per million tokens

Launch (May 2024) API price; OpenAI later cut it to $2.50 / $10.00 per million tokens.

Pricing source ↗

Strengths

  • Native omni multimodality — text, image and audio handled by one end-to-end model
  • Real-time voice: ~232ms audio response (320ms average), close to human latency
  • Strong, broad coding (91.0% HumanEval) and general knowledge (87.2% MMLU)
  • Large 128K-token context window
  • Much faster and cheaper than the GPT-4 Turbo it replaced (50% lower API price at launch)
  • Strong multilingual performance across 50+ languages

Best for

  • Real-time voice assistants and conversational agents (Advanced Voice Mode)
  • General coding, code generation and debugging
  • Multilingual chat, translation and summarization
  • Vision tasks: reading charts, diagrams, screenshots and documents
  • Long-document understanding within the 128K context window
  • High-volume chat applications where latency and cost matter

How to access

ProviderModel ID
OpenAI ↗gpt-4o-2024-05-13

GPT (Flagship / Thinking) — every version

The full lineage of the GPT (Flagship / Thinking) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
GPT-5.5current2026-04-231.05MProprietary
GPT-5.42026-03-05Proprietary
GPT-5.22025-12-11Proprietary
GPT-5.12025-11-12Proprietary
GPT-52025-08-07Proprietary
GPT-4o2024-05-13Proprietary

FAQ

What does the "o" in GPT-4o stand for?

"o" stands for "omni." GPT-4o was OpenAI's first model trained end-to-end as a single network across text, vision, and audio, so one neural network handles all inputs and outputs instead of chaining separate speech, text, and image models.

What were GPT-4o's benchmark scores?

On OpenAI's reproducible simple-evals harness, the gpt-4o-2024-05-13 snapshot scored 87.2% on MMLU, 91.0% on HumanEval, 76.6% on MATH, 49.9% on GPQA, 89.9% on MGSM, and 83.7% F1 on DROP.

How much did GPT-4o cost?

At launch in May 2024 it was $5.00 per million input tokens and $15.00 per million output tokens — 50% cheaper than GPT-4 Turbo. OpenAI later cut the price to $2.50 / $10.00 per million tokens.

Is GPT-4o still available?

It was retired from ChatGPT on February 13, 2026 but remained available in the API after that. OpenAI recommends newer GPT-5.x models as successors.