Command A (03-2025)

Name: Command A (03-2025)
Author: Cohere

Cohere's enterprise-grade 111B model for agents, RAG, and multilingual work — running on just two GPUs.

Overview

Command A (model ID command-a-03-2025) is Cohere's flagship Command-line large language model, released on March 13, 2025. It is a 111-billion-parameter, text-only model built for real-world enterprise work — agents, retrieval-augmented generation (RAG), tool use, code, and multilingual tasks — and it ships with a 256,000-token context window for processing long documents.

Cohere's headline pitch for Command A is efficiency: it needs only two GPUs (A100s or H100s) to run and delivers roughly 150% higher throughput than its predecessor, Command R+ (08-2024). Cohere positions it as on par with or better than GPT-4o and DeepSeek-V3 on agentic enterprise tasks while being significantly cheaper to serve.

Command A supports 23 languages (including English, French, Spanish, German, Japanese, Korean, Chinese, Arabic, Russian, and Hindi) and is open-weights — the weights are published on Hugging Face under a CC-BY-NC license, with a hosted version available through Cohere's API as command-a-03-2025. A 55-page technical report, 'Command A: An Enterprise-Ready Large Language Model' (arXiv 2504.00698), documents its hybrid architecture and evaluations.

Released	2025-03-13
License	CC-BY-NC (Creative Commons Attribution-NonCommercial), plus Cohere Lab's Acceptable Use Policy
Weights	Open weights
Parameters	111B
Context	256K
Max output	4K (on-demand); up to 256K in dedicated mode
Architecture	Auto-regressive transformer with a hybrid attention design: three layers use sliding-window attention (window size 4096) with RoPE, and a fourth layer uses global attention without positional embeddings. Pretraining is followed by supervised fine-tuning (SFT) and preference training to align for helpfulness and safety.
Modalities	Text
Status	Available

Benchmarks

GPQA Diamond50.51%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$2.50 / 1M tokens per 1M tokens
Output	$10.00 / 1M tokens per 1M tokens

Hosted Cohere API pricing for command-a-03-2025; trial API keys are free, production keys are pay-as-you-go. Self-hosting the open weights has no per-token fee.

Pricing source ↗

Strengths

Strong agentic and tool-use performance aimed at enterprise REACT-style agents with multistep tool calls
Built-in RAG with grounding and citations for conversational, document-backed answers
256K-token context window for long enterprise documents
High efficiency — runs on just two A100/H100 GPUs with ~150% higher throughput than Command R+ (08-2024)
Multilingual coverage across 23 languages
Open weights available on Hugging Face for private/self-hosted deployment

Best for

Enterprise AI agents that call tools and APIs over multiple steps
Retrieval-augmented generation over internal knowledge bases with citations
Long-document analysis and summarization using the 256K context window
Multilingual customer support and content generation across 23 languages
Code generation and synthesis
On-premise or private-cloud deployment where open weights and a low GPU footprint matter

How to access

Provider	Model ID
Cohere ↗	`command-a-03-2025`
Oracle Cloud Infrastructure (OCI) Generative AI ↗	`cohere.command-a-03-2025`
OpenRouter ↗	`cohere/command-a`
Hugging Face (open weights) ↗	`CohereLabs/c4ai-command-a-03-2025`

FAQ

When was Command A (03-2025) released and who makes it?

Command A was released by Cohere on March 13, 2025. The model ID is command-a-03-2025, and it is the flagship of Cohere's Command line of enterprise large language models.

How large is Command A and what context window does it support?

Command A has 111 billion parameters and a 256,000-token (256K) context window, letting it process long enterprise documents in a single prompt.

Is Command A open weights, and what does it cost?

Yes — the weights are published on Hugging Face under a CC-BY-NC (non-commercial) license plus Cohere Lab's Acceptable Use Policy, so you can self-host. The hosted Cohere API is priced at about $2.50 per 1M input tokens and $10.00 per 1M output tokens.

What is Command A best at?

Cohere built Command A for enterprise tasks: agentic tool use, retrieval-augmented generation (RAG) with grounding and citations, code, and multilingual work across 23 languages. A key selling point is efficiency — it runs on just two A100/H100 GPUs with roughly 150% higher throughput than Command R+ (08-2024).

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// FAQ