Command A+

Name: Command A+
Author: Cohere

Cohere's first Apache-2.0 frontier model: a 218B sparse MoE (25B active) for agentic, multilingual, and vision tasks that runs on as few as two H100s.

Overview

Command A+ is Cohere's flagship large language model, released on May 20, 2026 as the model ID command-a-plus-05-2026. It is the company's first Mixture-of-Experts model and the last entry in the Command A family, folding vision input, reasoning, translation, and agentic tool use into a single 218-billion-parameter network that activates only 25 billion parameters per token.

Notably, Command A+ is Cohere's first Command frontier model shipped under a full Apache 2.0 license with open weights. Cohere publishes near-lossless 16-bit (BF16), 8-bit (FP8), and 4-bit (W4A4) quantizations on the Hugging Face Hub, so the recommended W4A4 build runs on as few as two NVIDIA H100 GPUs or a single B200 — a footprint Cohere targets at sovereign, on-premise, and regulated enterprise deployments where data must stay inside trusted systems.

Beyond raw scores, Command A+ adds native citation grounding: when it pulls from an external tool or document it emits explicit grounding spans that tie each factual claim back to its source. It supports a 128K-token context window, up to 64K output tokens, and 48 languages (up from 23 in the original Command A), including every official EU language.

Released	2026-05-20
License	Apache-2.0
Weights	Open weights
Parameters	218B total / 25B active (sparse MoE)
Context	128K
Max output	64K tokens
Architecture	Decoder-only sparse Mixture-of-Experts Transformer with 128 experts (8 active per token plus one shared expert) and interleaved sliding-window / global attention. Runs on as few as 2x NVIDIA H100 (W4A4) or 1x B200; FP8 needs 4x H100, BF16 needs 8x H100.
Modalities	Text, Vision
Status	Available

Benchmarks

Scatter chart plotting output tokens per second against the Artificial Analysis Intelligence Index, positioning Command A+ against other named models including gpt-oss-120b and Gemini 3.5 Flash. — Command A+ reaches ~281 output tokens/sec, with higher intelligence and speed than gpt-oss-120b but behind the Pareto frontier set by Gemini 3.5 Flash. — Artificial Analysis

AA-Omniscience knowledge-and-hallucination chart plotting Command A+ against comparable named models on accuracy versus non-hallucination. — On AA-Omniscience, Command A+ has relatively low accuracy but hallucinates the least, yielding a headline AA-Omniscience score of -4. — Artificial Analysis

This model's scores

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Strengths

Open weights under a true Apache 2.0 license — commercial, government, and military use with no revenue caps or non-compete clauses
Efficient sparse MoE: 25B active parameters keep inference fast while 218B total parameters back the quality
Strongest agentic and tool-use model in the Command family, with large gains on multi-step enterprise task benchmarks
Native citation grounding links every claim to its source document for verifiable RAG
Small deployment footprint: near-lossless W4A4 quantization fits on 2x H100 or a single B200
Broad multilingual coverage across 48 languages, including all official EU languages

Best for

On-premise and sovereign enterprise agents that must keep sensitive data in-house
Retrieval-augmented generation where every answer needs verifiable, cited sources
Multi-step agentic workflows and tool use in customer support, operations, and analytics
Multilingual document understanding and translation across 48 languages
Multimodal document analysis combining text and image (vision) inputs

How to access

Provider	Model ID
Cohere ↗	`command-a-plus-05-2026`

FAQ

Is Command A+ open source?

Yes. Command A+ is released with open weights under a full Apache 2.0 license, allowing commercial, government, and military use with no revenue caps or non-compete clauses. Weights are published on the Hugging Face Hub under the CohereLabs organization in BF16, FP8, and W4A4 quantizations.

What hardware does Command A+ need to run?

The recommended near-lossless W4A4 (4-bit) build runs on as few as two NVIDIA H100 GPUs or a single B200. The FP8 build needs about four H100s and the full BF16 build needs about eight H100s.

How big is Command A+ and what context length does it support?

It is a sparse Mixture-of-Experts model with 218 billion total parameters and 25 billion active per token (128 experts, 8 active plus one shared). It supports a 128K-token context window and up to 64K output tokens.

What makes Command A+ good for enterprise agents?

It is the strongest agentic and tool-use model in the Command family, with large gains on multi-step task benchmarks like τ²-Bench Telecom (85%), plus native citation grounding that links every claim to its source — useful for verifiable RAG. Its small deployment footprint suits sovereign, on-premise use where data must stay in-house.

// Overview

// Benchmarks

This model's scores

// Strengths

// Best for

// How to access

// FAQ

Overview

Benchmarks

Strengths

Best for

How to access

FAQ