Grok 4

xAI's frontier reasoning model, RL-trained for native tool use and real-time X search, with a Grok 4 Heavy multi-agent tier.

Overview

Grok 4 is xAI's flagship large language model, released July 9, 2025 as the company's first model positioned as a frontier reasoning system. It is the successor to Grok 3 and the foundation of the later Grok 4.1 and Grok 4.3 releases. Grok 4 was trained with roughly 10x more reinforcement-learning compute than Grok 3, using xAI's Colossus cluster of about 200,000 GPUs, and reinforcement learning was applied at pretraining scale to make tool use a first-class capability rather than an add-on.

What sets Grok 4 apart is native, RL-trained tool use: the model reasons while calling a code interpreter and searching the web and X (formerly Twitter) in real time, then folds the results back into its answer. It accepts text and image input and returns text. xAI shipped it in two forms — standard Grok 4 (a single reasoning agent) and Grok 4 Heavy, a multi-agent tier in which several copies of the model work a problem in parallel and reconcile their answers for higher accuracy on the hardest tasks.

At launch Grok 4 posted frontier scores across reasoning, math, and science benchmarks, and Grok 4 Heavy was the first system to clear 50% on Humanity's Last Exam (text-only subset). The standard model is served via the xAI API as grok-4-0709 with a 256K-token context window (128K in the Grok consumer app). It has since been superseded by cheaper, faster Grok 4.1 and 4.3 releases but remains accessible as a legacy model.

Released	2025-07-09
License	Proprietary
Weights	API only
Context	256K
Max output	8K
Architecture	Built on the sixth generation of xAI's foundation model and trained with roughly 10x more reinforcement-learning compute than Grok 3 on the Colossus 200,000-GPU cluster. xAI ran RL at pretraining scale to teach Grok 4 to natively use tools (code execution and web/X search) while reasoning. The companion Grok 4 Heavy tier runs multiple agents in parallel on the same model and compares their work to reach an answer. Exact parameter count is not disclosed by xAI.
Knowledge cutoff	November 2024
Modalities	Text, Vision
Status	Available (legacy; superseded by Grok 4.1/4.3)

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$3.00 / 1M tokens per 1M tokens
Output	$15.00 / 1M tokens per 1M tokens

xAI list price for the grok-4-0709 model. The standard Grok 4 tier; the Grok 4 Heavy multi-agent tier is offered through the SuperGrok Heavy consumer plan ($300/month).

Pricing source ↗

Strengths

Frontier reasoning, math, and graduate-level science performance at launch (GPQA 87.5%, AIME 2025 91.7%, HMMT 2025 90.0%)
Native, reinforcement-learning-trained tool use — code execution plus real-time web and X search woven into its reasoning
Strong abstract-reasoning generalization: first model to break the single-digit wall on ARC-AGI-2 (15.9%, independently verified by the ARC Prize Foundation)
Grok 4 Heavy multi-agent tier pushes the hardest benchmarks higher (50.7% on Humanity's Last Exam, 100% on AIME 2025)
Large 256K-token API context window for long documents and extended agentic sessions
Vision (image) input alongside text

Best for

Hard STEM and competition-math problem solving (AIME/HMMT/USAMO-style reasoning)
Graduate-level science Q&A and research assistance
Agentic workflows that need a model to plan, run code, and search the web/X mid-reasoning
Coding and code review aided by tool use and a large context window
Long-document analysis using the 256K-token context window
Research-grade tasks where Grok 4 Heavy's parallel multi-agent accuracy is worth the higher cost

How to access

Provider	Model ID
xAI ↗	`grok-4-0709`

Grok (flagship) — every version

The full lineage of the Grok (flagship) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Grok 4.3current	2026-04-30	1M	Proprietary
Grok 4.20	2026-03	—	Proprietary
Grok 4.1	2025-11-17	—	Proprietary
Grok 4	2025-07-09	—	Proprietary
Grok 3	2025-02-17	—	Proprietary
Grok 2	2024-08-20	—	Open weights
Grok 1.5	2024-05-15	—	Proprietary
Grok 1	2023-11-03	—	Apache-2.0

FAQ

When was Grok 4 released?

xAI released Grok 4 on July 9, 2025, announcing it in a livestream alongside the Grok 4 Heavy multi-agent tier. The API model id is grok-4-0709.

What is the difference between Grok 4 and Grok 4 Heavy?

Standard Grok 4 is a single reasoning agent. Grok 4 Heavy runs several copies of the model in parallel on the same problem and reconciles their answers, which raises accuracy on the hardest benchmarks — for example it reached 50.7% on Humanity's Last Exam and 100% on AIME 2025, versus 41.0% and 91.7% for standard Grok 4. Heavy is offered through xAI's SuperGrok Heavy plan.

How much does Grok 4 cost via the API?

xAI lists the standard Grok 4 model (grok-4-0709) at $3.00 per million input tokens and $15.00 per million output tokens. Newer Grok 4.1 and 4.3 releases are cheaper, so Grok 4 is now a legacy option.

What is Grok 4's context window and knowledge cutoff?

Grok 4 has a 256K-token context window in the API (128K in the Grok consumer app) and a maximum output of about 8K tokens. xAI's documentation lists a knowledge cutoff of November 2024. It accepts text and image input and returns text.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Grok (flagship) — every version

// FAQ