GPT-5.1

Name: GPT-5.1
Author: OpenAI

OpenAI's warmer, more conversational GPT-5 update — adds adaptive reasoning that spends fewer tokens on easy tasks, a no-reasoning mode, and shell + apply_patch coding tools.

Overview

GPT-5.1 is OpenAI's flagship large language model in the GPT-5 (Flagship / Thinking) line, released on November 12, 2025 in ChatGPT and rolled out to the API for developers on November 13. It ships as two models: GPT-5.1 Instant — a warmer, more conversational default tuned for everyday chat and quick coding — and GPT-5.1 Thinking, a reasoning model aimed at harder, multi-step problems. It positions itself as a smarter, friendlier successor to GPT-5 rather than a clean-sheet model.

Its headline change is adaptive reasoning: GPT-5.1 decides how much time to spend thinking based on how hard the task looks, spending fewer tokens on straightforward prompts and more on complex ones. OpenAI reports it runs roughly 2-3x faster than GPT-5 on simple tasks while using far fewer tokens, and it adds a new 'none' reasoning-effort option that makes the model respond like a non-reasoning model for latency-sensitive use. ChatGPT also gained eight selectable personality presets and a warmer default tone.

For developers, GPT-5.1 carries a 400,000-token context window, up to 128,000 output tokens, a September 30, 2024 knowledge cutoff, and text-plus-image (vision) input. The API release adds two new built-in tools — apply_patch for structured, reliable code edits and a shell tool for proposing shell commands a developer runs — plus extended prompt caching that retains prompts for up to 24 hours. It is priced at $1.25 per million input tokens and $10 per million output tokens, with cached input at $0.125 per million.

Released	2025-11-12
License	Proprietary
Weights	API only
Parameters	Undisclosed
Context	400K
Max output	128K
Architecture	Proprietary transformer with a new adaptive reasoning framework: GPT-5.1 dynamically scales how many internal reasoning tokens it spends with task difficulty, so simple prompts return fast and cheap while hard ones get deeper thinking. Shipped as two variants — GPT-5.1 Instant (warm, fast, default) and GPT-5.1 Thinking (reasoning model for complex work) — plus a 'none' reasoning-effort setting that makes the model behave like a non-reasoning model for latency-sensitive use.
Knowledge cutoff	September 30, 2024
Modalities	Text, Vision
Status	Generally available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$1.25 / 1M tokens per 1M tokens
Cached input	$0.125 / 1M tokens per 1M tokens
Output	$10.00 / 1M tokens per 1M tokens

Standard API pricing for the gpt-5.1 model. Cached input is a 90% discount on input; extended prompt caching retains prompts for up to 24 hours. Reasoning tokens in Thinking mode are billed as output tokens.

Pricing source ↗

Strengths

Adaptive reasoning that spends fewer thinking tokens on easy prompts and more on hard ones — roughly 2-3x faster than GPT-5 on simple tasks at lower token cost
Two complementary variants from one line: GPT-5.1 Instant (warm, fast, default) and GPT-5.1 Thinking (deeper reasoning for complex problems)
A 'none' reasoning-effort setting that turns off thinking entirely for latency-sensitive, interactive use
Strong coding and agentic results — 76.3% on SWE-bench Verified — plus new apply_patch and shell tools that tighten the edit-and-run loop
Frontier science reasoning: 88.1% on GPQA Diamond and 94.0% on AIME 2025
Large 400K-token context window with up to 24-hour extended prompt caching for cheaper, faster follow-ups
Vision input (text + images) for screenshots, charts, diagrams, and document pages
Warmer, more steerable ChatGPT personality with eight selectable presets

Best for

Everyday conversational assistance and writing where a warmer, faster default tone matters
Agentic coding workflows that edit files (apply_patch) and propose shell commands for a developer to run
Complex reasoning, math, and science tasks routed to GPT-5.1 Thinking
Latency-sensitive interactive apps using Instant with reasoning_effort set to 'none'
Long-document and large-codebase analysis within the 400K-token context window
Vision-assisted steps that read screenshots, charts, scanned pages, and diagrams
Cost-sensitive high-volume API workloads that benefit from adaptive token spend and 24-hour prompt caching

How to access

Provider	Model ID
OpenAI API ↗	`gpt-5.1 / gpt-5.1-chat-latest`
OpenRouter ↗	`openai/gpt-5.1`

GPT (Flagship / Thinking) — every version

The full lineage of the GPT (Flagship / Thinking) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
GPT-5.5current	2026-04-23	1.05M	Proprietary
GPT-5.4	2026-03-05	—	Proprietary
GPT-5.2	2025-12-11	—	Proprietary
GPT-5.1	2025-11-12	—	Proprietary
GPT-5	2025-08-07	—	Proprietary
GPT-4o	2024-05-13	—	Proprietary

FAQ

What is GPT-5.1?

GPT-5.1 is OpenAI's flagship large language model, released November 12, 2025 (and to developers on November 13). It comes in two variants — GPT-5.1 Instant, a warmer and faster default, and GPT-5.1 Thinking, a reasoning model for complex tasks — and its main new feature is adaptive reasoning that scales how much the model thinks with the difficulty of the task.

How is GPT-5.1 different from GPT-5?

GPT-5.1 has a warmer, more conversational default tone, eight selectable personality presets in ChatGPT, and adaptive reasoning that makes it roughly 2-3x faster than GPT-5 on simple tasks while using fewer tokens. It also adds a 'none' reasoning-effort option, new apply_patch and shell tools for coding, and extended prompt caching of up to 24 hours.

How much does GPT-5.1 cost?

Per OpenAI's API pricing, GPT-5.1 is $1.25 per million input tokens and $10.00 per million output tokens, with cached input at $0.125 per million (a 90% discount). Reasoning tokens used by GPT-5.1 Thinking are billed as output tokens.

What is GPT-5.1's context window and knowledge cutoff?

GPT-5.1 has a 400,000-token context window with up to 128,000 output tokens, a knowledge cutoff of September 30, 2024, and accepts text plus image (vision) input. Audio input is not supported.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// GPT (Flagship / Thinking) — every version

// FAQ