AI/TLDR

Command R (08-2024)

Cohere's efficient 32B model built for RAG, tool use, and multilingual enterprise workloads.

Overview

Command R (08-2024) is Cohere's refreshed mid-size large language model, released on 30 August 2024 alongside the larger Command R+ (08-2024). It is a 32-billion-parameter model with a 128K-token context window, purpose-built for retrieval-augmented generation (RAG), tool use (function calling), and multilingual enterprise applications rather than for chasing raw chatbot leaderboard scores. On the Cohere API it is served under the model ID command-r-08-2024.

Compared with the original March 2024 Command R, this version is noticeably better at math, code, and reasoning and, per Cohere, is competitive with the previous-generation Command R+ while being far cheaper to run. It delivers around 50% higher throughput and 20% lower latency than its predecessor and cuts the hardware needed to serve it roughly in half. It also improves decision-making around when to call a tool, sharpens multilingual RAG (answering in the user's language), produces higher-quality citations, and can now decline questions it cannot answer.

Command R (08-2024) is trained on 23 languages and evaluated in 10 (English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic). Open weights are published under a CC-BY-NC license on Hugging Face for research, and the release also introduced Safety Modes (beta) with configurable STRICT and CONTEXTUAL behavior. It is text-only for both input and output.

Released2024-08-30
LicenseCC-BY-NC 4.0 (open weights, non-commercial) plus Cohere's Acceptable Use Policy; commercial use available via the Cohere API under separate terms
WeightsOpen weights
Parameters32B
Context128K
Max output4K
ArchitectureAuto-regressive transformer using grouped-query attention (GQA) to speed up inference, post-trained with supervised fine-tuning (SFT) and preference training. The August 2024 refresh delivers roughly 50% higher throughput and 20% lower latency than the original Command R while halving the hardware footprint needed to serve it.
ModalitiesText
StatusAvailable

Benchmarks

  1. GPQA Diamond26.77%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.15 / 1M tokens per 1M tokens
Output$0.60 / 1M tokens per 1M tokens

Cohere API on-demand pricing for command-r-08-2024 as stated in Cohere's August 2024 release changelog. On-demand responses are capped at 4K output tokens; dedicated deployments lift the output cap.

Pricing source ↗

Strengths

  • Strong retrieval-augmented generation with grounded, citation-backed answers and an optional citation toggle
  • Reliable single-step and multi-step tool use (function calling) for agentic workflows
  • Multilingual coverage: trained on 23 languages, with RAG that responds in the user's language
  • Large 128K-token context window for long documents and conversation history
  • Efficient to serve: ~50% higher throughput and ~20% lower latency than the original Command R, at roughly half the hardware footprint
  • Low API pricing ($0.15 / $0.60 per million tokens) makes high-volume RAG and tool-use pipelines affordable
  • Open weights on Hugging Face enable research, fine-tuning, and self-hosting

Best for

  • Retrieval-augmented generation over enterprise knowledge bases and document sets
  • Agents and assistants that call tools / APIs (single-step and multi-step)
  • Multilingual question answering and summarization across 10+ languages
  • Long-document analysis using the 128K context window
  • Structured data extraction and analysis
  • High-volume, cost-sensitive production workloads where Command R+ would be overkill

How to access

ProviderModel ID
Cohere ↗command-r-08-2024

Command R — every version

The full lineage of the Command R line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Command R (08-2024)current2024-08-30Proprietary
Command R (03-2024)2024-03-11Proprietary

FAQ

What is Command R (08-2024)?

It is Cohere's refreshed mid-size large language model, released on 30 August 2024. It has 32 billion parameters and a 128K-token context window, and is optimized for retrieval-augmented generation (RAG), tool use, and multilingual enterprise tasks. On the Cohere API it is the model command-r-08-2024.

How is it different from the original Command R (03-2024)?

The 08-2024 version is better at math, code, and reasoning, makes smarter tool-use decisions, improves multilingual RAG and citation quality, and can decline unanswerable questions. It also runs about 50% faster in throughput with roughly 20% lower latency and about half the hardware footprint of the original.

Is Command R (08-2024) open source?

The weights are openly published on Hugging Face under a CC-BY-NC license (non-commercial) together with Cohere's Acceptable Use Policy, so you can download them for research and self-hosting. Commercial use runs through the Cohere API under separate commercial terms.

How much does Command R (08-2024) cost on the API?

Per Cohere's August 2024 release notes, command-r-08-2024 is priced at $0.15 per million input tokens and $0.60 per million output tokens, which is substantially cheaper than the larger Command R+ (08-2024).