Overview
Command A (model ID command-a-03-2025) is Cohere's flagship Command-line large language model, released on March 13, 2025. It is a 111-billion-parameter, text-only model built for real-world enterprise work — agents, retrieval-augmented generation (RAG), tool use, code, and multilingual tasks — and it ships with a 256,000-token context window for processing long documents.
Cohere's headline pitch for Command A is efficiency: it needs only two GPUs (A100s or H100s) to run and delivers roughly 150% higher throughput than its predecessor, Command R+ (08-2024). Cohere positions it as on par with or better than GPT-4o and DeepSeek-V3 on agentic enterprise tasks while being significantly cheaper to serve.
Command A supports 23 languages (including English, French, Spanish, German, Japanese, Korean, Chinese, Arabic, Russian, and Hindi) and is open-weights — the weights are published on Hugging Face under a CC-BY-NC license, with a hosted version available through Cohere's API as command-a-03-2025. A 55-page technical report, 'Command A: An Enterprise-Ready Large Language Model' (arXiv 2504.00698), documents its hybrid architecture and evaluations.
| Released | 2025-03-13 |
|---|---|
| License | CC-BY-NC (Creative Commons Attribution-NonCommercial), plus Cohere Lab's Acceptable Use Policy |
| Weights | Open weights |
| Parameters | 111B |
| Context | 256K |
| Max output | 4K (on-demand); up to 256K in dedicated mode |
| Architecture | Auto-regressive transformer with a hybrid attention design: three layers use sliding-window attention (window size 4096) with RoPE, and a fourth layer uses global attention without positional embeddings. Pretraining is followed by supervised fine-tuning (SFT) and preference training to align for helpfulness and safety. |
| Modalities | Text |
| Status | Available |
Benchmarks
- GPQA Diamond50.51%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $2.50 / 1M tokens per 1M tokens |
|---|---|
| Output | $10.00 / 1M tokens per 1M tokens |
Hosted Cohere API pricing for command-a-03-2025; trial API keys are free, production keys are pay-as-you-go. Self-hosting the open weights has no per-token fee.
Strengths
- Strong agentic and tool-use performance aimed at enterprise REACT-style agents with multistep tool calls
- Built-in RAG with grounding and citations for conversational, document-backed answers
- 256K-token context window for long enterprise documents
- High efficiency — runs on just two A100/H100 GPUs with ~150% higher throughput than Command R+ (08-2024)
- Multilingual coverage across 23 languages
- Open weights available on Hugging Face for private/self-hosted deployment
Best for
- Enterprise AI agents that call tools and APIs over multiple steps
- Retrieval-augmented generation over internal knowledge bases with citations
- Long-document analysis and summarization using the 256K context window
- Multilingual customer support and content generation across 23 languages
- Code generation and synthesis
- On-premise or private-cloud deployment where open weights and a low GPU footprint matter
How to access
| Provider | Model ID |
|---|---|
| Cohere ↗ | command-a-03-2025 |
| Oracle Cloud Infrastructure (OCI) Generative AI ↗ | cohere.command-a-03-2025 |
| OpenRouter ↗ | cohere/command-a |
| Hugging Face (open weights) ↗ | CohereLabs/c4ai-command-a-03-2025 |
FAQ
When was Command A (03-2025) released and who makes it?
Command A was released by Cohere on March 13, 2025. The model ID is command-a-03-2025, and it is the flagship of Cohere's Command line of enterprise large language models.
How large is Command A and what context window does it support?
Command A has 111 billion parameters and a 256,000-token (256K) context window, letting it process long enterprise documents in a single prompt.
Is Command A open weights, and what does it cost?
Yes — the weights are published on Hugging Face under a CC-BY-NC (non-commercial) license plus Cohere Lab's Acceptable Use Policy, so you can self-host. The hosted Cohere API is priced at about $2.50 per 1M input tokens and $10.00 per 1M output tokens.
What is Command A best at?
Cohere built Command A for enterprise tasks: agentic tool use, retrieval-augmented generation (RAG) with grounding and citations, code, and multilingual work across 23 languages. A key selling point is efficiency — it runs on just two A100/H100 GPUs with roughly 150% higher throughput than Command R+ (08-2024).
