AI/TLDR

Command A+

Cohere's first Apache-2.0 frontier model: a 218B sparse MoE (25B active) for agentic, multilingual, and vision tasks that runs on as few as two H100s.

Overview

Command A+ is Cohere's flagship large language model, released on May 20, 2026 as the model ID command-a-plus-05-2026. It is the company's first Mixture-of-Experts model and the last entry in the Command A family, folding vision input, reasoning, translation, and agentic tool use into a single 218-billion-parameter network that activates only 25 billion parameters per token.

Notably, Command A+ is Cohere's first Command frontier model shipped under a full Apache 2.0 license with open weights. Cohere publishes near-lossless 16-bit (BF16), 8-bit (FP8), and 4-bit (W4A4) quantizations on the Hugging Face Hub, so the recommended W4A4 build runs on as few as two NVIDIA H100 GPUs or a single B200 — a footprint Cohere targets at sovereign, on-premise, and regulated enterprise deployments where data must stay inside trusted systems.

Beyond raw scores, Command A+ adds native citation grounding: when it pulls from an external tool or document it emits explicit grounding spans that tie each factual claim back to its source. It supports a 128K-token context window, up to 64K output tokens, and 48 languages (up from 23 in the original Command A), including every official EU language.

Released2026-05-20
LicenseApache-2.0
WeightsOpen weights
Parameters218B total / 25B active (sparse MoE)
Context128K
Max output64K tokens
ArchitectureDecoder-only sparse Mixture-of-Experts Transformer with 128 experts (8 active per token plus one shared expert) and interleaved sliding-window / global attention. Runs on as few as 2x NVIDIA H100 (W4A4) or 1x B200; FP8 needs 4x H100, BF16 needs 8x H100.
ModalitiesText, Vision
StatusAvailable

Benchmarks

Scatter chart plotting output tokens per second against the Artificial Analysis Intelligence Index, positioning Command A+ against other named models including gpt-oss-120b and Gemini 3.5 Flash.
Command A+ reaches ~281 output tokens/sec, with higher intelligence and speed than gpt-oss-120b but behind the Pareto frontier set by Gemini 3.5 Flash. — Artificial Analysis
AA-Omniscience knowledge-and-hallucination chart plotting Command A+ against comparable named models on accuracy versus non-hallucination.
On AA-Omniscience, Command A+ has relatively low accuracy but hallucinates the least, yielding a headline AA-Omniscience score of -4. — Artificial Analysis

This model's scores

  1. τ²-Bench Telecom (agentic)85%
  2. Terminal-Bench Hard (agentic coding)25%
  3. MMMU (multimodal understanding)75.1%
  4. MMMU-Pro (visual reasoning)63%
  5. MathVista80.6%
  6. CharXiv (reasoning)52.7%
  7. Artificial Analysis Intelligence Index29%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Strengths

  • Open weights under a true Apache 2.0 license — commercial, government, and military use with no revenue caps or non-compete clauses
  • Efficient sparse MoE: 25B active parameters keep inference fast while 218B total parameters back the quality
  • Strongest agentic and tool-use model in the Command family, with large gains on multi-step enterprise task benchmarks
  • Native citation grounding links every claim to its source document for verifiable RAG
  • Small deployment footprint: near-lossless W4A4 quantization fits on 2x H100 or a single B200
  • Broad multilingual coverage across 48 languages, including all official EU languages

Best for

  • On-premise and sovereign enterprise agents that must keep sensitive data in-house
  • Retrieval-augmented generation where every answer needs verifiable, cited sources
  • Multi-step agentic workflows and tool use in customer support, operations, and analytics
  • Multilingual document understanding and translation across 48 languages
  • Multimodal document analysis combining text and image (vision) inputs

How to access

ProviderModel ID
Cohere ↗command-a-plus-05-2026

FAQ

Is Command A+ open source?

Yes. Command A+ is released with open weights under a full Apache 2.0 license, allowing commercial, government, and military use with no revenue caps or non-compete clauses. Weights are published on the Hugging Face Hub under the CohereLabs organization in BF16, FP8, and W4A4 quantizations.

What hardware does Command A+ need to run?

The recommended near-lossless W4A4 (4-bit) build runs on as few as two NVIDIA H100 GPUs or a single B200. The FP8 build needs about four H100s and the full BF16 build needs about eight H100s.

How big is Command A+ and what context length does it support?

It is a sparse Mixture-of-Experts model with 218 billion total parameters and 25 billion active per token (128 experts, 8 active plus one shared). It supports a 128K-token context window and up to 64K output tokens.

What makes Command A+ good for enterprise agents?

It is the strongest agentic and tool-use model in the Command family, with large gains on multi-step task benchmarks like τ²-Bench Telecom (85%), plus native citation grounding that links every claim to its source — useful for verifiable RAG. Its small deployment footprint suits sovereign, on-premise use where data must stay in-house.