AI/TLDR

GPT-5.3 Instant

OpenAI's fast default ChatGPT model, tuned for fewer refusals, fewer hallucinations, and a less preachy tone.

Overview

GPT-5.3 Instant is the fast, low-latency model OpenAI shipped on March 3, 2026 as the everyday default in ChatGPT, and exposed to developers through the API as gpt-5.3-chat-latest. It is the "Instant" tier of the GPT-5 line: it answers immediately rather than running an extended reasoning pass, which keeps time-to-first-token low for chat, drafting, and search-grounded tasks.

The headline change in GPT-5.3 Instant is accuracy and tone rather than raw capability. OpenAI reported that it cuts hallucinations by 26.8% on higher-stakes questions (medicine, law, finance) when it can use the web, and by 19.7% when relying on internal knowledge alone; on user-flagged factual-error conversations the reductions were 22.5% (with web) and 9.6% (without). On the SimpleQA factuality benchmark, OpenAI cited a drop in error rate from 8.4% to 6.1%. GPT-5.3 Instant also reduces unnecessary refusals and trims the moralizing preambles, filler, and over-enthusiastic phrasing that critics nicknamed the "cringe" tone.
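The SimpleQA figures above are absolute error rates, whereas the other percentages presumably describe relative reductions. The sketch below shows the arithmetic that connects the two readings, using only the 8.4% and 6.1% values quoted above. The helper name `relative_reduction` is illustrative and does not come from OpenAI.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# SimpleQA error rates quoted in the text, in percent.
before, after = 8.4, 6.1

absolute_drop = before - after                      # in percentage points
relative_drop = relative_reduction(before, after)   # in percent of the old rate

print(f"absolute drop: {absolute_drop:.1f} points")   # 2.3 points
print(f"relative drop: {relative_drop:.1f}%")         # 27.4%
```

Read this way, a 2.3-point drop in error rate is roughly a 27% relative reduction, the same order as the 26.8% figure reported for high-stakes questions with web access.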

GPT-5.3 Instant carries a 128K-token context window, a 16,384-token max output, and an August 31, 2025 knowledge cutoff, and it accepts text and image input. It was the ChatGPT default until GPT-5.5 Instant replaced it on May 5, 2026; OpenAI has since deprecated the gpt-5.3-chat-latest snapshot in the API and recommends GPT-5.5 for most new usage.

Released: 2026-03-03
License: Proprietary
Weights: API only
Parameters: Undisclosed
Context: 128K
Max output: 16,384 tokens
Architecture: Closed/undisclosed. GPT-5.3 Instant is the non-reasoning "Instant" tier of OpenAI's GPT-5 family: a fast, high-throughput model that answers directly without an extended thinking phase. OpenAI has not published a parameter count or architectural details.
Knowledge cutoff: 2025-08-31
Modalities: Text, Vision
Status: Deprecated
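The context and output limits listed above can be turned into a quick pre-flight check before sending a request. This is a minimal sketch under two assumptions that the page does not confirm: that "128K" means 128,000 tokens, and that prompt and output tokens share the same context window. The function name `fits` is hypothetical, and token counts are expected to come from a tokenizer that is not shown here.

```python
CONTEXT_WINDOW = 128_000  # assumption: "128K" read as 128,000 tokens
MAX_OUTPUT = 16_384       # max output tokens, as listed above


def fits(prompt_tokens: int, requested_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt plus the requested output stays within the limits.

    Assumes input and output tokens count against one shared context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW


print(fits(100_000))          # prompt leaves room for a full-length reply
print(fits(120_000))          # prompt too long for a full-length reply
print(fits(120_000, 4_000))   # same prompt with a shorter reply budget
```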

Benchmarks

  1. SimpleQA (factuality accuracy): 50%
  2. MMLU-Pro: 81%

Scores are on a 0–100 scale; higher is better.

Pricing

Input: $1.75 per 1M tokens
Cached input: $0.175 per 1M tokens
Output: $14.00 per 1M tokens

API pricing for gpt-5.3-chat-latest. Now deprecated; OpenAI recommends GPT-5.5 for most API usage.
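To make the per-million rates concrete, here is a rough cost estimator built from the three prices listed above. It is a sketch, not an official billing formula: it assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the full input rate. The name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens for gpt-5.3-chat-latest.
PRICE_PER_MILLION = {
    "input": Decimal("1.75"),
    "cached_input": Decimal("0.175"),
    "output": Decimal("14.00"),
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: `cached_input_tokens` is the part of `input_tokens`
    billed at the cached rate; the remainder is billed at the full rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * PRICE_PER_MILLION["input"]
             + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
             + output_tokens * PRICE_PER_MILLION["output"])
    return total / Decimal(1_000_000)


# 10,000 input tokens (4,000 of them cached) and 1,000 output tokens.
cost = estimate_cost(10_000, 1_000, cached_input_tokens=4_000)
print(f"${cost:.4f}")  # $0.0252
```

In this example the 1,000 output tokens account for more than half of the total, which reflects the 8x gap between the input and output rates.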

Strengths

  • Low latency: answers directly without an extended reasoning phase, keeping time-to-first-token fast for chat and search-grounded replies
  • Markedly fewer hallucinations than GPT-5.2 Instant — OpenAI reported 26.8% fewer on high-stakes questions with web access and 19.7% fewer without
  • Fewer unnecessary refusals and a cleaner, less moralizing tone for sensitive or everyday work tasks
  • Better at balancing fresh web-search results against its own knowledge when answering
  • Text + vision input with a 128K-token context window

Best for

  • Everyday ChatGPT conversation and quick question-answering
  • Drafting customer responses, internal communications, and summaries
  • Web-grounded research where up-to-date, well-contextualized answers matter
  • Guidance on sensitive topics where over-cautious refusals previously got in the way
  • High-throughput API workloads that need low latency rather than deep step-by-step reasoning

How to access

Provider | Model ID
OpenAI | gpt-5.3-chat-latest
OpenRouter | openai/gpt-5.3-chat

GPT Instant — every version

The full lineage of the GPT Instant line, newest first.

Version | Released | License
GPT-5.5 Instant (current) | 2026-05-05 | Proprietary
GPT-5.3 Instant | 2026-03-03 | Proprietary
GPT-5.2 Instant | 2025-12-11 | Proprietary
GPT-5.1 Instant | 2025-11-12 | Proprietary

FAQ

When was GPT-5.3 Instant released?

OpenAI released GPT-5.3 Instant on March 3, 2026, rolling it out to all ChatGPT users and to developers via the API as gpt-5.3-chat-latest. It served as the ChatGPT default until GPT-5.5 Instant replaced it on May 5, 2026.

What is the GPT-5.3 Instant API model ID and price?

The API model ID is gpt-5.3-chat-latest. OpenAI listed it at $1.75 per million input tokens, $0.175 per million cached-input tokens, and $14.00 per million output tokens. The model is now deprecated, and OpenAI recommends GPT-5.5 for most API usage.

What is GPT-5.3 Instant's context window and knowledge cutoff?

GPT-5.3 Instant has a 128K-token context window, a maximum output of 16,384 tokens, and a knowledge cutoff of August 31, 2025. It accepts text and image input.

How is GPT-5.3 Instant different from GPT-5.2 Instant?

GPT-5.3 Instant focuses on accuracy and tone rather than new capabilities. OpenAI reported 26.8% fewer hallucinations on high-stakes questions with web access (19.7% without), fewer unnecessary refusals, and a less preachy, less filler-heavy conversational style — the so-called "anti-cringe" update.