Codestral Mamba

Name: Codestral Mamba
Author: Mistral AI

Mistral's open 7B code model built on the Mamba2 architecture for fast, linear-time inference over long code contexts.

Overview

Codestral Mamba is an open 7-billion-parameter code model from Mistral AI, released on July 16, 2024, alongside the math model Mathstral as part of Mistral's specialist open-model line. Unlike Mistral's Transformer-based Codestral 22B, Codestral Mamba is built on the Mamba2 architecture, a state-space model whose inference time grows linearly with sequence length and which can, in theory, handle sequences of unbounded length. Mistral tested it for in-context retrieval up to 256k tokens, making it well suited to processing long files and large codebases without the quadratic growth in attention cost that a Transformer incurs as the prompt grows.
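To make the scaling difference concrete, here is a back-of-the-envelope sketch of how prompt-processing work grows with prompt length. The cost functions are simplifying assumptions (full self-attention counted as n² pairwise token interactions, a state-space scan as n state updates); they ignore constants, model width, and hardware effects, so the ratios are illustrative rather than measured.

```python
# Illustrative scaling only: unit-free operation counts, not benchmarks.

def attention_cost(n_tokens: int) -> int:
    """Pairwise token interactions in full self-attention: grows as n^2."""
    return n_tokens * n_tokens


def ssm_cost(n_tokens: int) -> int:
    """One state update per token in a state-space scan: grows as n."""
    return n_tokens


def growth_factor(cost_fn, short: int, long: int) -> float:
    """How many times more work the longer prompt needs than the shorter one."""
    return cost_fn(long) / cost_fn(short)


if __name__ == "__main__":
    short, long = 4_000, 256_000  # 256k is the retrieval length Mistral tested
    print("attention growth:  ", growth_factor(attention_cost, short, long))
    print("state-space growth:", growth_factor(ssm_cost, short, long))
```

Under these assumptions, going from a 4,000-token prompt to a 256,000-token prompt multiplies the linear cost by 64 and the quadratic cost by 4,096.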

The model has exactly 7,285,403,648 parameters and was released under the permissive Apache 2.0 license, so it is free to use, modify, and self-host. Mistral published it both on its la Plateforme API under the name open-codestral-mamba (also addressable as codestral-mamba-2407) and as open weights on Hugging Face at mistralai/Mamba-Codestral-7B-v0.1. It can be deployed with Mistral's mistral-inference SDK or NVIDIA's TensorRT-LLM, and it was positioned as a lightweight alternative to the larger Codestral 22B for local code assistance.

Mistral retired the open-codestral-mamba endpoint from la Plateforme on June 6, 2025, pointing API users to the newer Codestral model. The open weights, however, remain available under Apache 2.0, so the model can still be downloaded, fine-tuned, and self-hosted indefinitely. It is best understood today as a historical, research-oriented release that demonstrated Mamba-style architectures could match Transformer code models on standard benchmarks at the 7B scale.

Released	2024-07-16
License	Apache 2.0
Weights	Open weights
Parameters	7,285,403,648 (~7.3B)
Context	Tested for in-context retrieval up to 256k tokens
Architecture	Mamba2 state-space model (linear-time inference, not a Transformer)
Modalities	text
Status	API retired on Mistral's la Plateforme on 2025-06-06 (replaced by Codestral); open weights remain freely available under Apache 2.0.

Benchmarks

[Benchmark chart: scores on a 0–100 scale; higher is better.]

Strengths

Mamba2 architecture gives linear-time inference, so total processing time grows only in proportion to input length instead of quadratically
Tested for in-context retrieval up to 256k tokens, which is useful for reasoning over large files and codebases
Strong code benchmarks for a 7B model (75.0% HumanEval Python), competitive with much larger Transformer code models
Apache 2.0 open weights: free to self-host, fine-tune, and redistribute, subject only to the license's attribution and notice terms
Small enough (~7.3B params) to run locally as a code-completion assistant
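As a rough check on the "small enough to run locally" point, the sketch below estimates weight storage from the stated parameter count. The storage formats listed are assumptions chosen for illustration (this page does not say which precisions or quantized variants are distributed), and the figures cover weights only, not runtime state or activations.

```python
PARAMS = 7_285_403_648  # parameter count stated on this page

# Bytes per parameter for some common storage formats (weights only).
# These formats are illustrative assumptions, not a list of published variants.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gib(params: int, dtype: str) -> float:
    """Approximate weight storage in GiB for the given storage format."""
    return params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gib(PARAMS, dtype):.1f} GiB")
```

At 2 bytes per parameter the weights come to roughly 13.6 GiB, and a hypothetical 4-bit format would need about 3.4 GiB.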

Best for

Local, low-latency code autocompletion and inline coding assistance
Generating and explaining code where a long context (large files, many imports) must stay in the prompt
Research and experimentation with non-Transformer (state-space) architectures for code
Self-hosted or air-gapped coding tools that need a permissively licensed model
Fine-tuning a compact open code model for a specific language or codebase

How to access

Provider	Model ID
Mistral AI (la Plateforme, endpoint retired 2025-06-06)	`open-codestral-mamba`
Hugging Face (open weights)	`mistralai/Mamba-Codestral-7B-v0.1`
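Because the hosted endpoint was retired on 2025-06-06, tooling that targets this model may need to fall back to the open weights. The sketch below encodes that choice using only the identifiers and date given on this page; the function name and the returned dictionary layout are hypothetical and do not correspond to any Mistral or Hugging Face API.

```python
from datetime import date

# Identifiers and retirement date as given on this page.
API_MODEL_ID = "open-codestral-mamba"
API_RETIRED_ON = date(2025, 6, 6)
HF_REPO_ID = "mistralai/Mamba-Codestral-7B-v0.1"


def access_route(today: date) -> dict:
    """Return the access route that applies on the given date (hypothetical helper)."""
    if today < API_RETIRED_ON:
        return {"provider": "la Plateforme", "model_id": API_MODEL_ID}
    # From the retirement date onward, only the open weights remain.
    return {"provider": "Hugging Face", "model_id": HF_REPO_ID}


if __name__ == "__main__":
    print(access_route(date.today()))
```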

Specialist open models (Mathstral / Codestral Mamba) — every version

The full lineage of the Specialist open models (Mathstral / Codestral Mamba) line, newest first.

Version	Released	Context	License
Mathstral 7B (current)	2024-07-16	—	Apache-2.0
Codestral Mamba	2024-07-16	256k (tested)	Apache-2.0

FAQ

What architecture does Codestral Mamba use?

It uses the Mamba2 state-space architecture rather than a Transformer. Mamba models run in linear time with respect to sequence length and can, in theory, model arbitrarily long sequences, which is why Codestral Mamba's response time grows only in proportion to input length rather than quadratically.
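The linear-time behavior comes from processing tokens through a fixed-size recurrent state. The following is a deliberately tiny sketch of that idea: a scalar linear recurrence with made-up constants a, b, and c. It is not Mamba2, whose state is much larger and whose parameters depend on the input; it only shows that each step does a constant amount of work regardless of how many tokens came before.

```python
# Toy scalar state-space recurrence (an illustration, NOT Mamba2 itself).
# The coefficients a, b, c are arbitrary values assumed for this example.

def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Run h_t = a*h_{t-1} + b*x_t and y_t = c*h_t over the sequence xs."""
    h = 0.0  # the entire "memory" of the past is this one fixed-size state
    ys = []
    for x in xs:
        h = a * h + b * x  # constant work per token, independent of position
        ys.append(c * h)
    return ys


if __name__ == "__main__":
    # An impulse at the first step decays geometrically through the state.
    print(ssm_scan([1.0, 0.0, 0.0, 0.0]))
```

One pass over n inputs performs n state updates, whereas attention compares each new token with all earlier ones.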

How large is Codestral Mamba and what context length does it support?

It has 7,285,403,648 parameters (about 7.3B) and was tested for in-context retrieval up to 256k tokens, so it can keep large files and codebases in context.

Is Codestral Mamba free and open source?

Yes. It is released under the Apache 2.0 license, so the weights are free to download, modify, self-host, and redistribute. They are published on Hugging Face as mistralai/Mamba-Codestral-7B-v0.1.

Is Codestral Mamba still available?

Mistral retired the open-codestral-mamba endpoint from its la Plateforme API on June 6, 2025, recommending the newer Codestral model instead. The open weights remain available under Apache 2.0, so you can still download, fine-tune, and self-host the model.
