Grok 4.1

Conversational refresh of the Grok flagship: warmer emotional intelligence, sharper creative writing, and a near 3x drop in hallucinations.

Overview

Grok 4.1 is xAI's flagship chat-and-reasoning model, released on 17 November 2025 as the new default Grok across grok.com, X, and the iOS and Android apps. Rather than chasing raw exam scores, this release targets everyday usefulness: xAI tuned it for higher emotional intelligence, more natural creative writing, and far fewer made-up facts. It ships in two modes — a default fast mode (codename 'tensor') and a deeper 'Thinking' reasoning mode (codename 'quasarflux').

The headline change is reliability. xAI reports Grok 4.1's hallucination rate fell from 12.09% to 4.22%, and its FActScore error rate dropped from 9.89% to 2.97% — both roughly a threefold improvement over the prior generation. xAI ran a quiet two-week rollout (1-14 November 2025) in which blind A/B testers preferred Grok 4.1's answers 64.78% of the time, and both modes posted top placements on the LMArena Text Arena and EQ-Bench3 emotional-intelligence leaderboards at launch.
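
As a quick check on the "roughly threefold" description, the arithmetic can be sketched in Python. The before and after percentages are the ones xAI reports above; the helper name `improvement` is an illustrative choice, not something taken from xAI.

```python
def improvement(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (reduction factor, relative drop in percent) for an error rate."""
    factor = before_pct / after_pct
    relative_drop = (before_pct - after_pct) / before_pct * 100
    return factor, relative_drop


# Figures as reported by xAI in the paragraph above.
reported = {
    "hallucination rate": (12.09, 4.22),
    "FActScore error rate": (9.89, 2.97),
}

for name, (before, after) in reported.items():
    factor, drop = improvement(before, after)
    print(f"{name}: {factor:.2f}x lower ({drop:.1f}% relative drop)")
```

The two ratios come out at about 2.9x and 3.3x, which is what "roughly threefold" summarizes.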

Grok 4.1 launched as a consumer model first: it was free to use with rate limits, with those limits lifted for SuperGrok subscribers, but it was not exposed through the standard xAI developer API at release. Developers were instead pointed to the separate, lower-cost Grok 4.1 Fast model and the Agent Tools API, both released two days later. Grok 4.1 carries a 256K-token context window and a ceiling of roughly 8K tokens per response, with a knowledge cutoff around November 2024.

Released	2025-11-17
License	Proprietary
Weights	Not released
Parameters	Undisclosed
Context	256K
Max output	~8K tokens / response
Architecture	Undisclosed
Knowledge cutoff	November 2024
Modalities	Text
Status	Generally available
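
The context and output limits listed above can be turned into a rough prompt budget. The sketch below is only illustrative and rests on assumptions that the source does not confirm: that "256K" means 256,000 tokens, that "~8K" means 8,000 tokens, and that the prompt and the reply share the same context window. The function names are hypothetical, and token counts are taken as inputs because no tokenizer is specified here.

```python
CONTEXT_WINDOW_TOKENS = 256_000  # assumption: "256K" read as 256,000 tokens
MAX_OUTPUT_TOKENS = 8_000        # assumption: "~8K" read as 8,000 tokens


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Largest prompt that still leaves room for a reply of `reserved_output` tokens.

    Assumes prompt and reply are counted against the same context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output must be between 0 and the output ceiling")
    return CONTEXT_WINDOW_TOKENS - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """True if a prompt of `prompt_tokens` fits alongside the reserved reply."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)


print(max_prompt_tokens())  # prompt budget when a full-length reply is reserved
print(fits(200_000))        # a 200,000-token prompt under these assumptions
```

Under these assumptions a full-length reply leaves 248,000 tokens for the prompt; if the real limits are counted differently, only the two constants need to change.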

Benchmarks

Scores are on a 0–100 scale; higher is better.

Strengths

Sharply lower hallucination rate — xAI reports a drop from 12.09% to 4.22% versus the prior generation
Improved factual accuracy on FActScore (error rate 9.89% to 2.97%)
Strong emotional intelligence and empathy, topping the EQ-Bench3 roleplay leaderboard at launch
More coherent, less formulaic creative and long-form writing
Available free to all users (with rate limits) on web, X, and mobile
Optional 'Thinking' mode for harder, multi-step reasoning tasks

Best for

Everyday conversational assistance and brainstorming where tone and empathy matter
Creative and long-form writing — stories, scripts, marketing copy
Drafting and editing where factual reliability is important
Roleplay and emotionally-aware companion interactions
Research and Q&A inside its 256K-token context window
Real-time discussion of current topics surfaced through Grok on X

Grok (flagship) — every version

The full lineage of the Grok (flagship) line, newest first.

Version	Released	Context	License
Grok 4.3 (current)	2026-04-30	1M	Proprietary
Grok 4.20	2026-03	—	Proprietary
Grok 4.1	2025-11-17	256K	Proprietary
Grok 4	2025-07-09	—	Proprietary
Grok 3	2025-02-17	—	Proprietary
Grok 2	2024-08-20	—	Open weights
Grok 1.5	2024-05-15	—	Proprietary
Grok 1	2023-11-03	—	Apache-2.0

FAQ

When was Grok 4.1 released?

xAI released Grok 4.1 on 17 November 2025, after a quiet two-week rollout from 1-14 November in which it was A/B tested against the previous Grok. It became the default model on grok.com, X, and the Grok mobile apps.

What is the biggest improvement in Grok 4.1?

Reliability. xAI reports the hallucination rate fell from 12.09% to 4.22% and the FActScore error rate dropped from 9.89% to 2.97% — both roughly threefold improvements. The model is also tuned for stronger emotional intelligence and more natural creative writing.

Can I use Grok 4.1 through the API?

Not the flagship consumer model. At launch Grok 4.1 was available only in the consumer apps (grok.com, X, iOS, Android). For developer/API access, xAI released a separate, lower-cost model called Grok 4.1 Fast — alongside an Agent Tools API — two days later.

What is Grok 4.1's context window and knowledge cutoff?

The flagship Grok 4.1 model has a 256K-token context window and caps each reply at roughly 8K tokens. Its training data runs to around November 2024. (The separate Grok 4.1 Fast variant offers a larger context window.)
