Overview
Command R (08-2024) is Cohere's refreshed mid-size large language model, released on 30 August 2024 alongside the larger Command R+ (08-2024). It is a 32-billion-parameter model with a 128K-token context window, purpose-built for retrieval-augmented generation (RAG), tool use (function calling), and multilingual enterprise applications rather than for chasing raw chatbot leaderboard scores. On the Cohere API it is served under the model ID command-r-08-2024.
Compared with the original March 2024 Command R, this version is noticeably better at math, code, and reasoning and, per Cohere, is competitive with the previous-generation Command R+ while being far cheaper to run. It delivers around 50% higher throughput and 20% lower latency than its predecessor and cuts the hardware needed to serve it roughly in half. It also improves decision-making around when to call a tool, sharpens multilingual RAG (answering in the user's language), produces higher-quality citations, and can now decline questions it cannot answer.
Command R (08-2024) is trained on 23 languages and evaluated in 10 (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic). Open weights are published under a CC-BY-NC license on Hugging Face for research, and the release also introduced Safety Modes (beta) with configurable STRICT and CONTEXTUAL behavior. It is text-only for both input and output.
| Released | 2024-08-30 |
|---|---|
| License | CC-BY-NC 4.0 (open weights, non-commercial) plus Cohere's Acceptable Use Policy; commercial use available via the Cohere API under separate terms |
| Weights | Open weights |
| Parameters | 32B |
| Context | 128K |
| Max output | 4K |
| Architecture | Auto-regressive transformer using grouped-query attention (GQA) to speed up inference, post-trained with supervised fine-tuning (SFT) and preference training. The August 2024 refresh delivers roughly 50% higher throughput and 20% lower latency than the original Command R while halving the hardware footprint needed to serve it. |
| Modalities | Text |
| Status | Available |
Benchmarks
- GPQA Diamond26.77%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.15 / 1M tokens per 1M tokens |
|---|---|
| Output | $0.60 / 1M tokens per 1M tokens |
Cohere API on-demand pricing for command-r-08-2024 as stated in Cohere's August 2024 release changelog. On-demand responses are capped at 4K output tokens; dedicated deployments lift the output cap.
Strengths
- Strong retrieval-augmented generation with grounded, citation-backed answers and an optional citation toggle
- Reliable single-step and multi-step tool use (function calling) for agentic workflows
- Multilingual coverage: trained on 23 languages, with RAG that responds in the user's language
- Large 128K-token context window for long documents and conversation history
- Efficient to serve: ~50% higher throughput and ~20% lower latency than the original Command R, at roughly half the hardware footprint
- Low API pricing ($0.15 / $0.60 per million tokens) makes high-volume RAG and tool-use pipelines affordable
- Open weights on Hugging Face enable research, fine-tuning, and self-hosting
Best for
- Retrieval-augmented generation over enterprise knowledge bases and document sets
- Agents and assistants that call tools / APIs (single-step and multi-step)
- Multilingual question answering and summarization across 10+ languages
- Long-document analysis using the 128K context window
- Structured data extraction and analysis
- High-volume, cost-sensitive production workloads where Command R+ would be overkill
How to access
| Provider | Model ID |
|---|---|
| Cohere ↗ | command-r-08-2024 |
Command R — every version
The full lineage of the Command R line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Command R (08-2024)current | 2024-08-30 | — | Proprietary |
| Command R (03-2024) | 2024-03-11 | — | Proprietary |
FAQ
What is Command R (08-2024)?
It is Cohere's refreshed mid-size large language model, released on 30 August 2024. It has 32 billion parameters and a 128K-token context window, and is optimized for retrieval-augmented generation (RAG), tool use, and multilingual enterprise tasks. On the Cohere API it is the model command-r-08-2024.
How is it different from the original Command R (03-2024)?
The 08-2024 version is better at math, code, and reasoning, makes smarter tool-use decisions, improves multilingual RAG and citation quality, and can decline unanswerable questions. It also runs about 50% faster in throughput with roughly 20% lower latency and about half the hardware footprint of the original.
Is Command R (08-2024) open source?
The weights are openly published on Hugging Face under a CC-BY-NC license (non-commercial) together with Cohere's Acceptable Use Policy, so you can download them for research and self-hosting. Commercial use runs through the Cohere API under separate commercial terms.
How much does Command R (08-2024) cost on the API?
Per Cohere's August 2024 release notes, command-r-08-2024 is priced at $0.15 per million input tokens and $0.60 per million output tokens, which is substantially cheaper than the larger Command R+ (08-2024).
