North Mini Code 1.0

Name: North Mini Code 1.0
Author: Cohere

Cohere's first North-family model: an open Apache-2.0 30B / 3B-active MoE built specifically for agentic coding, 256K context, runs on a single H100.

Overview

North Mini Code 1.0 is a 30-billion-parameter sparse Mixture-of-Experts model from Cohere Labs, released on June 9, 2026 as the first entry in a new "North" family of open coding models. It activates only 3 billion parameters per token through 128 experts (8 active), runs on a single NVIDIA H100 at FP8, and is distributed on the Hugging Face Hub under Apache 2.0 in BF16, FP8, and W4A16 quantizations.

Cohere positions the model as a coding sub-agent under an orchestrator rather than a general assistant — it is optimized for code generation, agentic software engineering, and terminal tasks, with tool-use and interleaved thinking support. Inputs and outputs are text-only, and the model supports a 256K-token context window with up to 64K output tokens.

Cohere reports 67.6% on SWE-Bench Verified, 40.2% on SWE-Bench Pro, and 36% on Terminal-Bench v2 in its model card, and 33.4 on the Artificial Analysis Coding Index in the launch blog. The company highlights ~2.8x higher output throughput and ~30% lower inter-token latency than Mistral's Devstral Small 2 at comparable accuracy.

Released	2026-06-09
License	Apache-2.0
Weights	Open weights
Parameters	30B total / 3B active (sparse MoE, 128 experts, 8 active per token)
Context	256K
Max output	64K tokens
Architecture	Decoder-only sparse Mixture-of-Experts Transformer with 128 experts (8 active per token). Trained via two-stage cascaded supervised fine-tuning followed by reinforcement learning with verifiable rewards (RLVR). Distributed in BF16 alongside FP8 and W4A16 quantizations.
Modalities	Text
Status	Generally available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Strengths

Open weights under a true Apache 2.0 license, with no usage caps or non-compete clauses
Sparse MoE keeps inference cheap — 3B active parameters of a 30B total network — and fits a single H100 at FP8
256K-token context with 64K-token max output, suited to whole-repository agentic coding sessions
Built specifically for agentic software engineering: tool-use, interleaved thinking, SWE-agent and ReAct harnesses
Multiple distribution targets out of the box: BF16, FP8, W4A16, plus MLX builds for Apple silicon

Best for

Embedded as a coding sub-agent under an orchestrator handling planning and review
High-volume code execution on fixed hardware where per-token cost matters
Sovereign and on-premise deployments where source code must stay inside trusted systems
Local developer setups: a single H100 server or a workstation with MLX on Apple silicon
Terminal automation and multi-step agentic coding workflows

How to access

Provider	Model ID
Hugging Face (open weights) ↗	`CohereLabs/North-Mini-Code-1.0`

FAQ

What is North Mini Code 1.0?

It is the first model in Cohere Labs' new "North" family — a 30B-parameter sparse Mixture-of-Experts model (3B active per token) released on June 9, 2026 under Apache 2.0. It is purpose-built as a coding sub-agent: it handles code generation, agentic software engineering, and terminal tasks with tool use and interleaved thinking.

What hardware does North Mini Code 1.0 need?

The FP8 build fits on a single NVIDIA H100. Cohere also publishes a W4A16 quantization for smaller deployments and an MLX-compatible build that runs locally on Apple silicon at roughly 20 GB of RAM.

What is the context window and what modalities does it support?

256K-token input context and up to 64K output tokens per generation. Inputs and outputs are text only — North Mini Code is not multimodal.

How does North Mini Code compare to Devstral Small 2?

On Cohere's launch comparison, North Mini Code reaches roughly 2.8x the output throughput and 30% lower inter-token latency than Devstral Small 2 at competitive coding accuracy, scoring 33.4 on the Artificial Analysis Coding Index.

// Overview

// Benchmarks

// Strengths

// Best for

// How to access

// FAQ

Overview

Benchmarks

Strengths

Best for

How to access

FAQ