AI/TLDR

Command R+ (08-2024)

Cohere's refreshed 104B enterprise RAG and tool-use flagship — ~50% higher throughput, 25% lower latency, and lower pricing.

Overview

Command R+ (08-2024) is the August 2024 refresh of Cohere's flagship Command R+ model, a 104-billion-parameter dense transformer built for enterprise retrieval-augmented generation (RAG) and multi-step tool use. In the Cohere API it is addressed as command-r-plus-08-2024. It keeps the same 104B size and 128K-token context window as the original April 2024 release, but moves to Grouped Query Attention (GQA) and an updated training recipe.

Compared to the original Command R+, Cohere reports the 08-2024 version delivers roughly 50% higher throughput and 25% lower latency while keeping the same hardware footprint. The refresh also improves decision-making about when to use a tool, instruction-following from the system message, multilingual RAG in the user's own language, structured-data handling, and robustness to non-semantic prompt changes (whitespace, newlines). The models can now decline unanswerable questions and run RAG workflows without citations. A beta Safety Modes feature (STRICT / CONTEXTUAL) was introduced alongside this release.

The model is optimized for 10 languages (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic) with pre-training data spanning 23 languages in total. It is served through Cohere's platform and cloud partners, and the weights are also published for research on Hugging Face under a CC-BY-NC license plus Cohere's Acceptable Use Policy.

Released2024-08-30
LicenseProprietary (API); open weights under CC-BY-NC on Hugging Face
WeightsOpen weights
Parameters104B (dense)
Context128K
Max output4K tokens
ArchitectureDense autoregressive transformer with Grouped Query Attention (GQA)
Knowledge cutoffFebruary 2023
ModalitiesText
StatusGenerally available

Pricing

Input$2.50 / 1M tokens
Output$10.00 / 1M tokens

Pricing source ↗

Strengths

  • ~50% higher throughput and ~25% lower latency than the original Command R+ at the same hardware footprint
  • Strong multi-step tool use and agentic workflows with improved tool-selection decisions
  • Retrieval-augmented generation with inline citations (and optional citation-free RAG)
  • Long 128K-token context window for long-document and multi-turn enterprise tasks
  • Multilingual: optimized for 10 languages, pre-trained on 23
  • Lower pricing than the 04-2024 release ($2.50 / $10.00 per million tokens)

Best for

  • Enterprise retrieval-augmented generation over private knowledge bases
  • Multi-step agents that call tools and APIs to complete tasks
  • Multilingual customer support and document Q&A
  • Structured-data extraction and analysis
  • Long-context summarization and analysis of large documents

How to access

ProviderModel ID
Cohere ↗command-r-plus-08-2024
Hugging Face (open weights, research) ↗CohereLabs/c4ai-command-r-plus-08-2024

Command R+ — every version

The full lineage of the Command R+ line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Command R+ (08-2024)current2024-08-30128KProprietary (API); open weights under CC-BY-NC on Hugging Face
Command R+ (04-2024)2024-04-04128KProprietary (API); open weights under CC-BY-NC on Hugging Face

FAQ

How is Command R+ (08-2024) different from the original Command R+?

It is the August 2024 refresh of the same 104B model with the same 128K context window. Cohere reports roughly 50% higher throughput and 25% lower latency at the same hardware footprint, plus better tool-use decisions, stronger instruction-following from the system message, improved multilingual RAG, better structured-data handling, and the ability to decline unanswerable questions. It also moved to Grouped Query Attention (GQA) and dropped in price to $2.50 / $10.00 per million tokens.

What does Command R+ (08-2024) cost?

On Cohere's platform it is priced at $2.50 per million input tokens and $10.00 per million output tokens, down from the $3.00 / $15.00 of the original 04-2024 release, per Cohere's pricing page.

Is Command R+ (08-2024) open weights?

The production model is served through Cohere's API, but the weights are also published for research on Hugging Face (CohereLabs/c4ai-command-r-plus-08-2024) under a CC-BY-NC license together with Cohere's Acceptable Use Policy. CC-BY-NC is non-commercial, so it is not a permissive open-source license like Apache-2.0.

What is the context window and max output of Command R+ (08-2024)?

It supports a 128K-token context window and up to 4K output tokens per response, the same limits as the original Command R+, per Cohere's models documentation.