Grok 1

xAI's first model: a 314B-parameter Mixture-of-Experts base model, later open-sourced under Apache 2.0.

Overview

Grok 1 is the first model from xAI, Elon Musk's AI company, announced on November 3, 2023. xAI presented it as a very early beta — in their own words, "the best we could do with 2 months of training" — and seeded it to a small set of US users before rolling it into the X (formerly Twitter) Premium+ subscription tier. Grok 1 was the engine behind the original Grok chatbot, pitched as a witty assistant with real-time access to information on X.

Under the hood, Grok 1 is a large Mixture-of-Experts (MoE) language model with 314 billion total parameters. Each token is routed to 2 of its 8 expert networks, so only about a quarter of the weights are active for any given token. It has a 64-layer Transformer architecture, an 8,192-token context window, and a pre-training data cutoff of October 2023. At launch, xAI reported that Grok 1 outperformed GPT-3.5 and Llama 2 70B on standard benchmarks while trailing larger frontier models like GPT-4.
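The routing described above can be illustrated with a small, dependency-free sketch. This is not xAI's implementation: the toy scalar experts, the softmax gate, and the renormalization of the two selected weights are all assumptions chosen to show the general top-2 idea.

```python
import math
import random

NUM_EXPERTS = 8  # expert count from the text above
TOP_K = 2        # experts used per token, from the text above


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, k=TOP_K):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Assumption: renormalize so the selected weights sum to 1.
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, router_logits, experts):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = 0.0
    for idx, weight in route_token(router_logits):
        out += weight * experts[idx](token)
    return out


# Hypothetical toy experts: expert i multiplies its scalar input by i + 1.
experts = [lambda x, s=s: x * s for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
print(selected)
print(moe_layer(1.0, logits, experts))
```

Only two of the eight expert functions are ever called per token, which is where the compute saving of sparse activation comes from.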

On March 17, 2024, xAI open-sourced Grok 1, releasing the network architecture and full weights under the permissive Apache 2.0 license on GitHub and Hugging Face. At 314B parameters it was, at the time, the largest open-weights MoE model publicly available. The released checkpoint is the raw base model from pre-training, not the chat-tuned version that ran in the product. Running it requires substantial GPU memory because of its size, and it needs additional fine-tuning to be useful as an assistant.
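To give a rough sense of the memory involved, the sketch below multiplies the 314B parameter count by the storage size of a few common numeric formats. It counts weights only (no activations or attention cache), and the choice of formats is an assumption for illustration, not a statement about how the released checkpoint is stored.

```python
TOTAL_PARAMS = 314e9  # total parameter count stated above

# Bytes needed to store one parameter in each format (illustrative choices).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Weight-only footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(TOTAL_PARAMS, dtype):,.0f} GB")
```

This works out to about 1,256 GB in float32, 628 GB in bfloat16, and 314 GB in int8. Sparse activation lowers the compute per token, but any token may be routed to any expert, so all experts still have to be held in memory.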

Released	2023-11-03
License	Apache-2.0
Weights	Open weights
Parameters	314B total (Mixture-of-Experts; 2 of 8 experts active per token, ~25% of weights)
Context	8,192 tokens
Max output	Not separately specified by xAI (shares the 8,192-token sequence length)
Architecture	Decoder-only Transformer with a Mixture-of-Experts (MoE) feed-forward layer. 314B total parameters across 8 experts, with 2 experts (roughly 25% of weights) activated per token. 64 layers; 48 attention heads for queries and 8 for keys/values; 6,144-dimensional embeddings; rotary position embeddings (RoPE); SentencePiece tokenizer with a 131,072-token vocabulary. The open-weights release is the raw pre-training base checkpoint — it was not fine-tuned for chat, dialogue, or any specific application and had no RLHF/safety tuning applied.
Knowledge cutoff	October 2023
Modalities	Text
Status	Superseded — succeeded by Grok 1.5 (May 2024) and later Grok versions. The base weights remain available as an open-weights release under Apache 2.0; the hosted chatbot has long since moved to newer models.
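The head counts in the Architecture row (48 query heads sharing 8 key/value heads) describe what is usually called grouped-query attention. The arithmetic below sketches what that implies for the attention cache at the full 8,192-token context. Two assumptions are made: the per-head dimension is taken as the embedding width divided by the number of query heads, and cached values are stored in 16 bits.

```python
NUM_LAYERS = 64
NUM_Q_HEADS = 48
NUM_KV_HEADS = 8
EMBED_DIM = 6144
CONTEXT_LEN = 8192

# Assumption: head dimension = embedding width / number of query heads.
HEAD_DIM = EMBED_DIM // NUM_Q_HEADS
# Number of query heads that share each key/value head.
QUERIES_PER_KV_HEAD = NUM_Q_HEADS // NUM_KV_HEADS


def kv_cache_bytes(seq_len, bytes_per_value=2, kv_heads=NUM_KV_HEADS):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * NUM_LAYERS * kv_heads * HEAD_DIM * seq_len * bytes_per_value


grouped = kv_cache_bytes(CONTEXT_LEN)
# Hypothetical comparison: every query head with its own key/value head.
ungrouped = kv_cache_bytes(CONTEXT_LEN, kv_heads=NUM_Q_HEADS)

print(f"head dim: {HEAD_DIM}, query heads per KV head: {QUERIES_PER_KV_HEAD}")
print(f"grouped KV cache:   {grouped / 2**30:.0f} GiB per sequence")
print(f"ungrouped KV cache: {ungrouped / 2**30:.0f} GiB per sequence")
```

Under these assumptions the cache for one full-length sequence is 2 GiB, one sixth of the 12 GiB it would take if all 48 heads kept their own keys and values.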

Benchmarks

[Benchmark chart not reproduced. Scores are on a 0–100 scale; higher is better.]

Strengths

Open weights under a fully permissive Apache 2.0 license — free to use, modify, and self-host
Very large 314B-parameter Mixture-of-Experts design, the largest open MoE model when released
Sparse activation (only ~25% of weights per token) keeps inference cheaper than a dense 314B model would be
Strong-for-its-era results: beat GPT-3.5 and Llama 2 70B on several public benchmarks at launch
Full architecture and JAX reference inference code published, useful for research and study

Best for

Research into large-scale Mixture-of-Experts architectures and sparse inference
A base model for teams that want to fine-tune their own assistant on permissively licensed weights
Historical and educational study of xAI's first-generation model
Self-hosted experimentation where data must stay on-premises and an open license is required

How to access

Provider	Model ID
xAI (open weights, GitHub)	`grok-1`
Hugging Face (open weights)	`xai-org/grok-1`
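As a minimal sketch of fetching the weights from the Hugging Face entry above, the helper below wraps `snapshot_download` from the `huggingface_hub` package. The function name `download_grok1`, the `checkpoints` target directory, and the `ckpt-0/*` file pattern are assumptions for illustration; check the repository's file listing before relying on them.

```python
def download_grok1(local_dir="checkpoints"):
    """Download the Grok 1 checkpoint files into local_dir and return the path."""
    # Imported lazily so this file loads even if huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="xai-org/grok-1",     # model ID from the table above
        repo_type="model",
        allow_patterns=["ckpt-0/*"],  # assumption: weights live under ckpt-0/
        local_dir=local_dir,          # hypothetical target directory
    )
```

Calling `download_grok1()` starts a transfer of hundreds of gigabytes, so make sure the target disk has room before running it.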

Grok (flagship) — every version

The full lineage of the Grok (flagship) line, newest first.

Version	Released	Context	License
Grok 4.3 (current)	2026-04-30	1M	Proprietary
Grok 4.20	2026-03	—	Proprietary
Grok 4.1	2025-11-17	—	Proprietary
Grok 4	2025-07-09	—	Proprietary
Grok 3	2025-02-17	—	Proprietary
Grok 2	2024-08-20	—	Open weights
Grok 1.5	2024-05-15	—	Proprietary
Grok 1	2023-11-03	8K	Apache-2.0

FAQ

When was Grok 1 released, and is it still in use?

xAI announced Grok 1 on November 3, 2023 as an early beta, available through X Premium+. It has since been superseded by Grok 1.5 (May 2024) and later Grok models. The base weights remain freely available as an open-weights release, but the hosted Grok chatbot now runs on much newer versions.

Are Grok 1's weights really open source?

Yes. On March 17, 2024, xAI published the full network architecture and weight parameters on GitHub and Hugging Face under the Apache 2.0 license, making it free to use, modify, and self-host. At 314B parameters it was the largest open Mixture-of-Experts model available at the time.

How big is Grok 1 and what architecture does it use?

Grok 1 is a Mixture-of-Experts (MoE) Transformer with 314 billion total parameters. It has 8 experts and activates 2 of them (about 25% of the weights) for each token. It uses 64 layers, an 8,192-token context window, rotary position embeddings, and a 131,072-token SentencePiece vocabulary.

Can I call Grok 1 through an API?

There was never an official xAI pay-as-you-go API for Grok 1. It was originally accessible only via an X Premium+ subscription, and the open-weights release is meant to be downloaded and self-hosted. Note that the released checkpoint is the raw pre-training base model — it was not chat- or instruction-tuned, so it needs fine-tuning to behave like an assistant.
