Command R7B (12-2024)

Name: Command R7B (12-2024)
Author: Cohere

Cohere's smallest, fastest R-series model — a 7B open-weights LLM with 128K context, tuned for RAG, tool use, and agents on commodity hardware.

Overview

Command R7B (12-2024) is the smallest, fastest, and final model in Cohere's R series of enterprise large language models, announced on December 13, 2024. At 7 billion parameters with a 128K-token context window, it is designed for high-throughput, latency-sensitive deployments and is compact enough to run on commodity GPUs, edge devices, and even CPUs and MacBooks.

The model is purpose-built for the workloads Cohere targets in production: retrieval-augmented generation (RAG), tool use, and multistep ReAct-style agents that require complex reasoning and active information seeking. It also handles summarization, question answering, and code, and supports 23 languages including English, French, Spanish, German, Japanese, Korean, Arabic, Chinese, and Hindi.

Command R7B ships as open weights on Hugging Face under a non-commercial CC-BY-NC license (the CohereLabs/c4ai-command-r7b-12-2024 release) for research, and is available for commercial use through the Cohere API as model `command-r7b-12-2024`. On the Hugging Face Open LLM Leaderboard it posts an average of 31.4, leading with strong IFEval instruction-following while keeping per-token costs among the lowest of any production model.

Released	2024-12-13
License	CC-BY-NC 4.0 (open weights, non-commercial; commercial use via Cohere API)
Weights	Open weights
Parameters	7B
Context	128K
Max output	4K
Architecture	Auto-regressive optimized transformer. Three sliding-window-attention layers (4096-token window) with RoPE positional encoding, plus a fourth layer with global attention across the full sequence; no positional embeddings on the global-attention layer.
Modalities	Text
Status	Available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.0375 / 1M tokens per 1M tokens
Output	$0.15 / 1M tokens per 1M tokens

Pricing source ↗

Strengths

Smallest and fastest model in Cohere's R series — runs on commodity GPUs, edge devices, and CPUs
Strong instruction following (77.9 on IFEval) and competitive RAG/tool-use performance for its size
128K-token context window despite a 7B footprint
Multilingual across 23 languages
Built-in support for RAG with grounded citations, tool/function calling, and multistep ReAct agents
Very low API pricing ($0.0375 input / $0.15 output per 1M tokens)
Open weights on Hugging Face for research and self-hosting

Best for

High-volume, latency-sensitive RAG over long documents and knowledge bases
Tool-using and ReAct-style agents in dynamic, real-world environments
On-device and edge inference where a compact model is required
Multilingual question answering and summarization across 23 languages
Cost-sensitive classification, extraction, and simple Q&A at scale
Financial and numerical information extraction in conversational settings

How to access

Provider	Model ID
Cohere ↗	`command-r7b-12-2024`
OpenRouter ↗	`cohere/command-r7b-12-2024`

FAQ

How many parameters does Command R7B have?

Command R7B (12-2024) is a 7-billion-parameter model — the smallest in Cohere's Command R series. Its compact size lets it run on commodity GPUs, edge devices, and even CPUs and MacBooks.

What is Command R7B's context window?

Command R7B supports a 128K-token context window and a maximum of 4K output tokens, making it suitable for long-document RAG despite its small size.

Is Command R7B open source?

The weights are openly released on Hugging Face (CohereLabs/c4ai-command-r7b-12-2024) under a non-commercial CC-BY-NC 4.0 license for research. Commercial use is available through the Cohere API as model `command-r7b-12-2024`.

How much does Command R7B cost to use?

Via API it is priced at roughly $0.0375 per 1M input tokens and $0.15 per 1M output tokens — among the lowest rates of any production LLM, which suits high-volume RAG, classification, and agent workloads.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// FAQ