AI/TLDR

Llama 3

Meta's first Llama 3 release: open-weight 8B and 70B text models trained on 15 trillion tokens, with an 8K context window.

Overview

Llama 3 is Meta's third-generation open-weight large language model family, launched on April 18, 2024 in two dense sizes — Llama 3 8B and Llama 3 70B — each available as a pretrained base model and an instruction-tuned (Instruct) chat model. At release it was Meta's most capable openly available model and set a new bar for the small/mid open-weight tier, outperforming peers like Mistral 7B, Gemma and the earlier Llama 2 70B on standard reasoning and coding benchmarks.

Both Llama 3 sizes are text-only and use a decoder-only transformer with Grouped-Query Attention and a larger 128K-token tokenizer, trained on more than 15 trillion tokens of publicly available text. The context window is 8,192 tokens — modest by later standards but a doubling of Llama 2's 4K. Meta published the weights under the permissive Meta Llama 3 Community License, letting developers run, fine-tune and self-host the models, which is why Llama 3 became a default base for countless fine-tunes and on-prem deployments.

Llama 3 was a stepping stone: Meta explicitly framed it as an early release and shipped the much larger, longer-context Llama 3.1 (including the frontier-scale 405B) just three months later on July 23, 2024, followed by Llama 3.2 and Llama 3.3. For new projects the later Llama 3.x models are the recommended choice, but the original Llama 3 8B and 70B weights remain freely downloadable and historically important.

Released2024-04-18
LicenseMeta Llama 3 Community License (open weights; custom, not OSI-approved — free for most commercial use, with extra terms for products exceeding 700M monthly active users)
WeightsOpen weights
ParametersTwo dense sizes: 8B and 70B parameters
Context8,192 tokens (8K)
ArchitectureDecoder-only (auto-regressive) transformer using Grouped-Query Attention (GQA) on both the 8B and 70B sizes, with a 128K-token tokenizer vocabulary. Pretrained on over 15 trillion tokens from publicly available sources (about 7x the Llama 2 corpus and 4x the code); instruct models tuned with SFT, rejection sampling, PPO and DPO.
Knowledge cutoffMarch 2023 (8B); December 2023 (70B)
ModalitiesText
StatusSuperseded. Llama 3 (the original 8B and 70B from April 2024) was replaced by Llama 3.1 on 2024-07-23 and later point releases (3.2, 3.3). The weights remain freely downloadable on Hugging Face, but most hosted API providers have retired the original endpoints in favor of the newer Llama 3.x models.

Benchmarks

  1. MMLU (5-shot) — 70B Instruct82%
  2. HumanEval (0-shot) — 70B Instruct81.7%
  3. GSM-8K (8-shot, CoT) — 70B Instruct93%
  4. MATH (4-shot, CoT) — 70B Instruct50.4%
  5. GPQA (0-shot) — 70B Instruct39.5%
  6. MMLU (5-shot) — 8B Instruct68.4%
  7. HumanEval (0-shot) — 8B Instruct62.2%
  8. GSM-8K (8-shot, CoT) — 8B Instruct79.6%
  9. MATH (4-shot, CoT) — 8B Instruct30%
  10. GPQA (0-shot) — 8B Instruct34.2%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.65 / 1M tokens (Llama 3 70B Instruct, cross-provider median) / 1M tokens
Output$2.75 / 1M tokens (Llama 3 70B Instruct, cross-provider median) / 1M tokens

Llama 3 has open weights and no first-party Meta API, so price depends on the hosting provider; these are cross-provider median rates for the original Llama 3 70B Instruct. The smaller 8B was typically served around $0.10–0.20 / 1M tokens. Many providers have since retired the original Llama 3 endpoints in favor of Llama 3.1/3.3.

Pricing source ↗

Strengths

  • Strong reasoning and coding for its size — Llama 3 70B Instruct scores 82.0 on MMLU and 81.7 on HumanEval, competitive with much larger closed models of its era
  • Fully open weights under a permissive community license: free to download, fine-tune and self-host
  • Efficient 8B model that runs on a single consumer GPU while still scoring 68.4 MMLU and 62.2 HumanEval
  • Grouped-Query Attention on both sizes for faster, cheaper inference
  • Huge ecosystem — became one of the most fine-tuned and widely hosted open models, supported across virtually every inference framework and cloud

Best for

  • Self-hosted chat assistants and internal tools where data must stay on-premises
  • Fine-tuning a base model on domain-specific data for classification, extraction or instruction following
  • Cost-sensitive, high-volume text generation and summarization via the lightweight 8B model
  • Code generation and assistance using the strong-for-its-size 70B model
  • Research, benchmarking and as an open baseline for building on top of frontier-quality open weights

How to access

ProviderModel ID
Hugging Face (download weights) ↗meta-llama/Meta-Llama-3-70B-Instruct
Together AI (hosted) ↗meta-llama/Llama-3-70b-chat-hf

Llama 3 — every version

The full lineage of the Llama 3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Llama 3.3 70Bcurrent2024-12-06Open weights
Llama 3.22024-09-25Open weights
Llama 3.12024-07-23Open weights
Llama 32024-04-18Open weights

FAQ

When was Llama 3 released and by whom?

Meta released Llama 3 on April 18, 2024, in two open-weight sizes: 8B and 70B parameters, each with a pretrained base and an instruction-tuned chat variant.

What is Llama 3's context window?

Both Llama 3 sizes support an 8,192-token (8K) context window — double Llama 2's 4K. The longer 128K context arrived later with Llama 3.1.

Is Llama 3 free and open source?

The weights are freely downloadable under the Meta Llama 3 Community License, which permits most commercial use. It is 'open weights' rather than strictly OSI open source: the license adds restrictions (notably extra terms for products with over 700 million monthly active users).

Should I use Llama 3 or Llama 3.1?

For new projects, Llama 3.1 (released July 23, 2024) or a later point release is recommended — they add a 128K context window, more languages, and the frontier-scale 405B model. The original Llama 3 8B and 70B weights remain available but are superseded.