OpenAI · 2026-04-23 · seismic

GPT-5.5 — OpenAI's Fully Retrained Flagship: 82.7% Terminal-Bench, 1M Context

Item: GPT-5.5 — OpenAI's Fully Retrained Flagship: 82.7% Terminal-Bench, 1M Context
Rating: 5
Author: AI/TLDR

GPT-5.5 is OpenAI's first fully retrained base model since GPT-4.5. It scores 82.7% on Terminal-Bench 2.0 — 13 points ahead of Claude Opus 4.7 — and 84.9% on GDPval. Two variants: standard ($5/$30 per MTok) and Pro ($30/$180). 1,499 HN points.

GPT-5.5 hero image from OpenAI — fully retrained agentic model with Terminal-Bench 2.0 and GDPval benchmark highlights

GPT-5.5 is OpenAI's first fully retrained base model since GPT-4.5: 82.7% on Terminal-Bench 2.0, 13 points ahead of Claude Opus 4.7.

Key specs

Context window	1M tokens
Terminal bench 2.0	82.7%
Gdpval (knowledge work)	84.9%
Swe bench pro	58.6%
Browse comp (pro)	90.1%
Frontier math tier 4 (pro)	39.6%
Osworld verified (computer use)	78.7%
Api input (standard)	$5/MTok
Api output (standard)	$30/MTok
Api input (pro)	$30/MTok
Api output (pro)	$180/MTok
Hn points	1,499

What is it?

GPT-5.5 is OpenAI's latest model, released April 23, 2026 to ChatGPT Plus, Pro, Business, and Enterprise users and to Codex with a 400K context window. Unlike GPT-5.1–5.4, which were post-training modifications of the same base, GPT-5.5 is a full retrain — new base architecture, new pretraining corpus, and agent-oriented training objectives. It comes in two variants: standard (all paid tiers) and Pro (Pro, Business, and Enterprise only). Both support 1 million tokens of context. API access launched alongside the ChatGPT rollout.

How does it work?

GPT-5.5 is purpose-built for agentic work: it plans multi-step tasks, coordinates tools, verifies its own output, and keeps running until a task is done without hand-holding at each step. The agent-oriented training objectives optimized for persistence on long-horizon work rather than single-turn answers. Terminal-Bench 2.0 tests complex command-line workflows requiring planning, iteration, and tool coordination — GPT-5.5 scores 82.7% vs Claude Opus 4.7 at 69.4% and Gemini 3.1 Pro at 68.5%. On GDPval (knowledge work), it matches or outperforms industry professionals at 84.9%. GPT-5.5 Pro scores 39.6% on FrontierMath Tier 4, nearly double Claude Opus 4.7's 22.9% on postdoctoral-level math.

Why does it matter?

A 13-point lead on Terminal-Bench 2.0 is a meaningful gap in agentic coding tasks, where most enterprise AI spend is going in 2026. The first full base retrain since GPT-4.5 means this is a real architectural step, not a fine-tune. Standard API pricing of $5/$30 per MTok is competitive for typical workloads; Pro at $30/$180 is positioned against the most capable frontier models for research and scientific work. Six weeks after GPT-5.4 shipped, the pace of retrained frontier models from OpenAI is accelerating.

Who is it for?

Developers building agentic coding pipelines; knowledge workers using ChatGPT for complex multi-step tasks; researchers needing strong coding and computer-use capabilities

Try it

ChatGPT: available now to Plus/Pro/Business/Enterprise. API: model=gpt-5.5 or gpt-5.5-pro