AI/TLDR

Grok 4.20

xAI's multi-agent flagship: a council of four specialist agents that debate, fact-check, and synthesize every answer.

Overview

Grok 4.20 is the March 2026 flagship from xAI, the model that first moved the Grok line from a single reasoning model to an explicit multi-agent design. Instead of one model answering, Grok 4.20 fields a small council of specialized agents that reason in parallel, debate each other, and combine their work into a single response. It launched in public beta on February 17, 2026, and reached the xAI API in March 2026 under the model IDs grok-4.20-0309-reasoning and grok-4.20-0309-non-reasoning, plus a dedicated multi-agent variant.

Grok 4.20 takes text and image input and returns text, with a 1,000,000-token context window and native function calling and structured-output support. xAI ships separate reasoning and non-reasoning variants so callers can trade depth for speed and cost: the reasoning variant for hard math, science, and multi-step agentic work, the non-reasoning variant for routine Q&A and high-throughput retrieval. The model's knowledge cutoff is November 2024.

Grok 4.20 is no longer xAI's newest flagship. It was followed by Grok 4.3 (April 2026), and sits in the line after Grok 4.1 (November 2025) and Grok 4 (July 2025). xAI has not published the model's parameter count, which is typical for its proprietary releases.

Released2026-03
LicenseProprietary
WeightsAPI only
Context1M
ArchitectureProprietary multi-agent system. Where earlier Grok releases were a single model (optionally with a "thinking" pass), Grok 4.20 runs a council of four specialized agents that think in parallel, debate, fact-check, and synthesize a final answer on each turn. xAI offers it in reasoning, non-reasoning, and multi-agent variants. The underlying model size and parameter count are not disclosed.
Knowledge cutoffNovember 2024
ModalitiesText, Vision
StatusSuperseded

Benchmarks

  1. Artificial Analysis Intelligence Index37index
  2. GPQA Diamond88.5%
  3. Humanity's Last Exam30%
  4. SciCode44.7%
  5. Terminal-Bench Hard40.9%
  6. τ²-Bench96.5%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$1.25 / 1M tokens per 1M tokens
Cached input$0.20 / 1M tokens per 1M tokens
Output$2.50 / 1M tokens per 1M tokens

Same pricing across the reasoning, non-reasoning, and multi-agent variants.

Pricing source ↗

Strengths

  • Multi-agent answering: four specialist agents debate and fact-check before responding, aimed at reducing single-pass mistakes on hard problems
  • Strong graduate-level science reasoning (88.5% on GPQA Diamond, reasoning variant)
  • Large 1M-token context window for long documents and codebases
  • Reasoning and non-reasoning variants let you tune for depth vs. speed and cost at the same price
  • Native function calling and structured outputs for agentic and tool-using workflows
  • Vision input (text plus images) alongside text generation

Best for

  • Complex math, science, and multi-step analytical problems that benefit from the reasoning variant
  • Agentic tool-calling and automation pipelines using native function calling and structured outputs
  • Long-document and large-codebase tasks that exploit the 1M-token context window
  • High-throughput, latency-sensitive Q&A and retrieval using the non-reasoning variant
  • Image-plus-text understanding tasks (vision input with text output)

How to access

ProviderModel ID
xAI ↗grok-4.20-0309-reasoning
xAI ↗grok-4.20-0309-non-reasoning
Oracle Cloud (OCI Generative AI) ↗xai.grok-4.20-0309-reasoning

Grok (flagship) — every version

The full lineage of the Grok (flagship) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Grok 4.3current2026-04-301MProprietary
Grok 4.202026-03Proprietary
Grok 4.12025-11-17Proprietary
Grok 42025-07-09Proprietary
Grok 32025-02-17Proprietary
Grok 22024-08-20Open weights
Grok 1.52024-05-15Proprietary
Grok 12023-11-03Apache-2.0

FAQ

What is Grok 4.20?

Grok 4.20 is xAI's flagship large language model released in March 2026 (public beta February 17, 2026). It was the first Grok release built as a multi-agent system: a council of four specialized agents that reason in parallel, debate, and synthesize a single answer, rather than one model responding alone.

How much does Grok 4.20 cost?

Per xAI's developer docs, Grok 4.20 is priced at $1.25 per 1M input tokens, $0.20 per 1M cached input tokens, and $2.50 per 1M output tokens. The reasoning, non-reasoning, and multi-agent variants share the same pricing.

What is Grok 4.20's context window and what inputs does it accept?

Grok 4.20 has a 1,000,000-token (1M) context window. It accepts text and image input and returns text, and supports native function calling and structured outputs. Its knowledge cutoff is November 2024.

Is Grok 4.20 still xAI's latest model?

No. Grok 4.20 has been superseded by Grok 4.3 (April 2026). It remains available via the xAI API and Oracle Cloud, but it is no longer xAI's newest flagship in the Grok line.