AI/TLDR

GPT-5.1

OpenAI's warmer, more conversational GPT-5 update — adds adaptive reasoning that spends fewer tokens on easy tasks, a no-reasoning mode, and shell + apply_patch coding tools.

Overview

GPT-5.1 is OpenAI's flagship large language model in the GPT-5 (Flagship / Thinking) line, released on November 12, 2025 in ChatGPT and rolled out to the API for developers on November 13. It ships as two models: GPT-5.1 Instant — a warmer, more conversational default tuned for everyday chat and quick coding — and GPT-5.1 Thinking, a reasoning model aimed at harder, multi-step problems. It positions itself as a smarter, friendlier successor to GPT-5 rather than a clean-sheet model.

Its headline change is adaptive reasoning: GPT-5.1 decides how much time to spend thinking based on how hard the task looks, spending fewer tokens on straightforward prompts and more on complex ones. OpenAI reports it runs roughly 2-3x faster than GPT-5 on simple tasks while using far fewer tokens, and it adds a new 'none' reasoning-effort option that makes the model respond like a non-reasoning model for latency-sensitive use. ChatGPT also gained eight selectable personality presets and a warmer default tone.

For developers, GPT-5.1 carries a 400,000-token context window, up to 128,000 output tokens, a September 30, 2024 knowledge cutoff, and text-plus-image (vision) input. The API release adds two new built-in tools — apply_patch for structured, reliable code edits and a shell tool for proposing shell commands a developer runs — plus extended prompt caching that retains prompts for up to 24 hours. It is priced at $1.25 per million input tokens and $10 per million output tokens, with cached input at $0.125 per million.

Released: 2025-11-12
License: Proprietary
Weights: API only
Parameters: Undisclosed
Context: 400K
Max output: 128K
Architecture: Proprietary transformer with a new adaptive reasoning framework. GPT-5.1 dynamically scales how many internal reasoning tokens it spends with task difficulty, so simple prompts return fast and cheap while hard ones get deeper thinking. It ships as two variants, GPT-5.1 Instant (warm, fast, default) and GPT-5.1 Thinking (reasoning model for complex work), plus a 'none' reasoning-effort setting that makes the model behave like a non-reasoning model for latency-sensitive use.
Knowledge cutoff: September 30, 2024
Modalities: Text, Vision
Status: Generally available

Benchmarks

  1. SWE-bench Verified (real-world software engineering): 76.3%
  2. GPQA Diamond (PhD-level science reasoning): 88.1%
  3. AIME 2025 (competition math): 94%
  4. SWE-bench Pro (harder multi-file engineering): 50.8%
  5. Artificial Analysis Intelligence Index (GPT-5.1 high): 39

Scores are on a 0–100 scale; higher is better. Each benchmark links to its published source.

Pricing

Input: $1.25 per 1M tokens
Cached input: $0.125 per 1M tokens
Output: $10.00 per 1M tokens

Standard API pricing for the gpt-5.1 model. Cached input is billed at a 90% discount to the standard input rate, and extended prompt caching retains prompts for up to 24 hours. Reasoning tokens in Thinking mode are billed as output tokens.
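To make the pricing arithmetic concrete, here is a minimal sketch of a per-request cost estimate built only from the three rates listed in this section. The function name `estimate_cost` and its parameters are hypothetical, and it assumes cached tokens are a subset of the input tokens; actual invoices may differ.

```python
from decimal import Decimal

# Per-million-token rates quoted in this section (USD).
INPUT_RATE = Decimal("1.25")
CACHED_INPUT_RATE = Decimal("0.125")
OUTPUT_RATE = Decimal("10.00")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the part of input_tokens served from the prompt cache.
    output_tokens should include reasoning tokens, which are billed as output.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / MILLION
```

For example, a request with 100,000 input tokens and 5,000 output tokens comes to $0.175 with no cache hits, and $0.085 if 80,000 of those input tokens are served from the cache.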


Strengths

  • Adaptive reasoning that spends fewer thinking tokens on easy prompts and more on hard ones — roughly 2-3x faster than GPT-5 on simple tasks at lower token cost
  • Two complementary variants from one line: GPT-5.1 Instant (warm, fast, default) and GPT-5.1 Thinking (deeper reasoning for complex problems)
  • A 'none' reasoning-effort setting that turns off thinking entirely for latency-sensitive, interactive use
  • Strong coding and agentic results — 76.3% on SWE-bench Verified — plus new apply_patch and shell tools that tighten the edit-and-run loop
  • Frontier science reasoning: 88.1% on GPQA Diamond and 94.0% on AIME 2025
  • Large 400K-token context window with up to 24-hour extended prompt caching for cheaper, faster follow-ups
  • Vision input (text + images) for screenshots, charts, diagrams, and document pages
  • Warmer, more steerable ChatGPT personality with eight selectable presets

Best for

  • Everyday conversational assistance and writing where a warmer, faster default tone matters
  • Agentic coding workflows that edit files (apply_patch) and propose shell commands for a developer to run
  • Complex reasoning, math, and science tasks routed to GPT-5.1 Thinking
  • Latency-sensitive interactive apps using Instant with reasoning_effort set to 'none'
  • Long-document and large-codebase analysis within the 400K-token context window
  • Vision-assisted steps that read screenshots, charts, scanned pages, and diagrams
  • Cost-sensitive high-volume API workloads that benefit from adaptive token spend and 24-hour prompt caching

How to access

Provider      Model ID
OpenAI API    gpt-5.1 / gpt-5.1-chat-latest
OpenRouter    openai/gpt-5.1
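As an illustration of how the model ID and the 'none' reasoning-effort setting might fit together, the sketch below only builds a request body and makes no network call. The field layout (`input`, `reasoning.effort`) is an assumption modeled on OpenAI's Responses API, the effort values other than 'none' are likewise assumed, and `build_request` is a hypothetical helper; check the provider's API reference before relying on any of it.

```python
import json

# Assumed set of effort levels; this page itself only names 'none' and 'high'.
ALLOWED_EFFORTS = {"none", "low", "medium", "high"}


def build_request(prompt: str, effort: str = "none", model: str = "gpt-5.1") -> dict:
    """Build a request body as a plain dict (hypothetical helper, no network I/O)."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return {
        "model": model,
        "input": prompt,
        # Assumed field layout: effort 'none' asks the model to skip reasoning.
        "reasoning": {"effort": effort},
    }


if __name__ == "__main__":
    print(json.dumps(build_request("Summarize this ticket in one line."), indent=2))
```

Keeping the body as a plain dict makes it easy to log, diff, or hand to whichever HTTP client or SDK the application already uses.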

GPT (Flagship / Thinking) — every version

The full lineage of the GPT (Flagship / Thinking) line, newest first.

Version            Released    Context  License
GPT-5.5 (current)  2026-04-23  1.05M    Proprietary
GPT-5.4            2026-03-05  -        Proprietary
GPT-5.2            2025-12-11  -        Proprietary
GPT-5.1            2025-11-12  -        Proprietary
GPT-5              2025-08-07  -        Proprietary
GPT-4o             2024-05-13  -        Proprietary

FAQ

What is GPT-5.1?

GPT-5.1 is OpenAI's flagship large language model, released November 12, 2025 (and to developers on November 13). It comes in two variants — GPT-5.1 Instant, a warmer and faster default, and GPT-5.1 Thinking, a reasoning model for complex tasks — and its main new feature is adaptive reasoning that scales how much the model thinks with the difficulty of the task.

How is GPT-5.1 different from GPT-5?

GPT-5.1 has a warmer, more conversational default tone, eight selectable personality presets in ChatGPT, and adaptive reasoning that makes it roughly 2-3x faster than GPT-5 on simple tasks while using fewer tokens. It also adds a 'none' reasoning-effort option, new apply_patch and shell tools for coding, and extended prompt caching of up to 24 hours.

How much does GPT-5.1 cost?

Per OpenAI's API pricing, GPT-5.1 is $1.25 per million input tokens and $10.00 per million output tokens, with cached input at $0.125 per million (a 90% discount). Reasoning tokens used by GPT-5.1 Thinking are billed as output tokens.

What is GPT-5.1's context window and knowledge cutoff?

GPT-5.1 has a 400,000-token context window with up to 128,000 output tokens, a knowledge cutoff of September 30, 2024, and accepts text plus image (vision) input. Audio input is not supported.
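The two limits above can be combined into a quick budget check. This is a rough sketch that assumes the prompt and the response share the 400,000-token window, which this page does not state explicitly; `max_output_budget` is a hypothetical helper, and token counts would come from whatever tokenizer the application uses.

```python
CONTEXT_WINDOW = 400_000  # total tokens, as listed on this page
MAX_OUTPUT = 128_000      # output-token cap, as listed on this page


def max_output_budget(prompt_tokens: int) -> int:
    """Return how many output tokens remain for a prompt of the given size.

    Assumes input and output tokens count against the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The budget can neither exceed the output cap nor drop below zero.
    return max(0, min(MAX_OUTPUT, remaining))
```

Under that assumption, a 100,000-token prompt still leaves the full 128,000-token output cap, while a 300,000-token prompt leaves room for at most 100,000 output tokens.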