Overview
Command R+ (08-2024) is the August 2024 refresh of Cohere's flagship Command R+ model, a 104-billion-parameter dense transformer built for enterprise retrieval-augmented generation (RAG) and multi-step tool use. In the Cohere API it is addressed as command-r-plus-08-2024. It keeps the same 104B size and 128K-token context window as the original April 2024 release, but moves to Grouped Query Attention (GQA) and an updated training recipe.
Compared to the original Command R+, Cohere reports the 08-2024 version delivers roughly 50% higher throughput and 25% lower latency while keeping the same hardware footprint. The refresh also improves decision-making about when to use a tool, instruction-following from the system message, multilingual RAG in the user's own language, structured-data handling, and robustness to non-semantic prompt changes (whitespace, newlines). The models can now decline unanswerable questions and run RAG workflows without citations. A beta Safety Modes feature (STRICT / CONTEXTUAL) was introduced alongside this release.
The model is optimized for 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic) with pre-training data spanning 23 languages in total. It is served through Cohere's platform and cloud partners, and the weights are also published for research on Hugging Face under a CC-BY-NC license plus Cohere's Acceptable Use Policy.
| Released | 2024-08-30 |
|---|---|
| License | Proprietary (API); open weights under CC-BY-NC on Hugging Face |
| Weights | Open weights |
| Parameters | 104B (dense) |
| Context | 128K |
| Max output | 4K tokens |
| Architecture | Dense autoregressive transformer with Grouped Query Attention (GQA) |
| Knowledge cutoff | February 2023 |
| Modalities | Text |
| Status | Generally available |
Pricing
| Input | $2.50 / 1M tokens |
|---|---|
| Output | $10.00 / 1M tokens |
Strengths
- ~50% higher throughput and ~25% lower latency than the original Command R+ at the same hardware footprint
- Strong multi-step tool use and agentic workflows with improved tool-selection decisions
- Retrieval-augmented generation with inline citations (and optional citation-free RAG)
- Long 128K-token context window for long-document and multi-turn enterprise tasks
- Multilingual: optimized for 10 languages, pre-trained on 23
- Lower pricing than the 04-2024 release ($2.50 / $10.00 per million tokens)
Best for
- Enterprise retrieval-augmented generation over private knowledge bases
- Multi-step agents that call tools and APIs to complete tasks
- Multilingual customer support and document Q&A
- Structured-data extraction and analysis
- Long-context summarization and analysis of large documents
How to access
| Provider | Model ID |
|---|---|
| Cohere ↗ | command-r-plus-08-2024 |
| Hugging Face (open weights, research) ↗ | CohereLabs/c4ai-command-r-plus-08-2024 |
Command R+ — every version
The full lineage of the Command R+ line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Command R+ (08-2024)current | 2024-08-30 | 128K | Proprietary (API); open weights under CC-BY-NC on Hugging Face |
| Command R+ (04-2024) | 2024-04-04 | 128K | Proprietary (API); open weights under CC-BY-NC on Hugging Face |
FAQ
How is Command R+ (08-2024) different from the original Command R+?
It is the August 2024 refresh of the same 104B model with the same 128K context window. Cohere reports roughly 50% higher throughput and 25% lower latency at the same hardware footprint, plus better tool-use decisions, stronger instruction-following from the system message, improved multilingual RAG, better structured-data handling, and the ability to decline unanswerable questions. It also moved to Grouped Query Attention (GQA) and dropped in price to $2.50 / $10.00 per million tokens.
What does Command R+ (08-2024) cost?
On Cohere's platform it is priced at $2.50 per million input tokens and $10.00 per million output tokens, down from the $3.00 / $15.00 of the original 04-2024 release, per Cohere's pricing page.
Is Command R+ (08-2024) open weights?
The production model is served through Cohere's API, but the weights are also published for research on Hugging Face (CohereLabs/c4ai-command-r-plus-08-2024) under a CC-BY-NC license together with Cohere's Acceptable Use Policy. CC-BY-NC is non-commercial, so it is not a permissive open-source license like Apache-2.0.
What is the context window and max output of Command R+ (08-2024)?
It supports a 128K-token context window and up to 4K output tokens per response, the same limits as the original Command R+, per Cohere's models documentation.
