Command R (08-2024)

Name: Command R (08-2024)
Author: Cohere

Cohere's efficient 32B model built for RAG, tool use, and multilingual enterprise workloads.

Overview

Command R (08-2024) is Cohere's refreshed mid-size large language model, released on 30 August 2024 alongside the larger Command R+ (08-2024). It is a 32-billion-parameter model with a 128K-token context window, purpose-built for retrieval-augmented generation (RAG), tool use (function calling), and multilingual enterprise applications rather than for chasing raw chatbot leaderboard scores. On the Cohere API it is served under the model ID command-r-08-2024.

Compared with the original March 2024 Command R, this version is noticeably better at math, code, and reasoning and, per Cohere, is competitive with the previous-generation Command R+ while being far cheaper to run. It delivers around 50% higher throughput and 20% lower latency than its predecessor and cuts the hardware needed to serve it roughly in half. It also improves decision-making around when to call a tool, sharpens multilingual RAG (answering in the user's language), produces higher-quality citations, and can now decline questions it cannot answer.

Command R (08-2024) is trained on 23 languages and evaluated in 10 (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic). Open weights are published under a CC-BY-NC license on Hugging Face for research, and the release also introduced Safety Modes (beta) with configurable STRICT and CONTEXTUAL behavior. It is text-only for both input and output.

Released	2024-08-30
License	CC-BY-NC 4.0 (open weights, non-commercial) plus Cohere's Acceptable Use Policy; commercial use available via the Cohere API under separate terms
Weights	Open weights
Parameters	32B
Context	128K
Max output	4K
Architecture	Auto-regressive transformer using grouped-query attention (GQA) to speed up inference, post-trained with supervised fine-tuning (SFT) and preference training. The August 2024 refresh delivers roughly 50% higher throughput and 20% lower latency than the original Command R while halving the hardware footprint needed to serve it.
Modalities	Text
Status	Available

Benchmarks

GPQA Diamond26.77%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.15 / 1M tokens per 1M tokens
Output	$0.60 / 1M tokens per 1M tokens

Cohere API on-demand pricing for command-r-08-2024 as stated in Cohere's August 2024 release changelog. On-demand responses are capped at 4K output tokens; dedicated deployments lift the output cap.

Pricing source ↗

Strengths

Strong retrieval-augmented generation with grounded, citation-backed answers and an optional citation toggle
Reliable single-step and multi-step tool use (function calling) for agentic workflows
Multilingual coverage: trained on 23 languages, with RAG that responds in the user's language
Large 128K-token context window for long documents and conversation history
Efficient to serve: ~50% higher throughput and ~20% lower latency than the original Command R, at roughly half the hardware footprint
Low API pricing ($0.15 / $0.60 per million tokens) makes high-volume RAG and tool-use pipelines affordable
Open weights on Hugging Face enable research, fine-tuning, and self-hosting

Best for

Retrieval-augmented generation over enterprise knowledge bases and document sets
Agents and assistants that call tools / APIs (single-step and multi-step)
Multilingual question answering and summarization across 10+ languages
Long-document analysis using the 128K context window
Structured data extraction and analysis
High-volume, cost-sensitive production workloads where Command R+ would be overkill

How to access

Provider	Model ID
Cohere ↗	`command-r-08-2024`

Command R — every version

The full lineage of the Command R line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Command R (08-2024)current	2024-08-30	—	Proprietary
Command R (03-2024)	2024-03-11	—	Proprietary

FAQ

What is Command R (08-2024)?

It is Cohere's refreshed mid-size large language model, released on 30 August 2024. It has 32 billion parameters and a 128K-token context window, and is optimized for retrieval-augmented generation (RAG), tool use, and multilingual enterprise tasks. On the Cohere API it is the model command-r-08-2024.

How is it different from the original Command R (03-2024)?

The 08-2024 version is better at math, code, and reasoning, makes smarter tool-use decisions, improves multilingual RAG and citation quality, and can decline unanswerable questions. It also runs about 50% faster in throughput with roughly 20% lower latency and about half the hardware footprint of the original.

Is Command R (08-2024) open source?

The weights are openly published on Hugging Face under a CC-BY-NC license (non-commercial) together with Cohere's Acceptable Use Policy, so you can download them for research and self-hosting. Commercial use runs through the Cohere API under separate commercial terms.

How much does Command R (08-2024) cost on the API?

Per Cohere's August 2024 release notes, command-r-08-2024 is priced at $0.15 per million input tokens and $0.60 per million output tokens, which is substantially cheaper than the larger Command R+ (08-2024).

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Command R — every version

// FAQ