AI/TLDR

North Mini Code 1.0

Cohere's first North-family model: an open Apache-2.0 30B / 3B-active MoE built specifically for agentic coding, 256K context, runs on a single H100.

Overview

North Mini Code 1.0 is a 30-billion-parameter sparse Mixture-of-Experts model from Cohere Labs, released on June 9, 2026 as the first entry in a new "North" family of open coding models. It activates only 3 billion parameters per token through 128 experts (8 active), runs on a single NVIDIA H100 at FP8, and is distributed on the Hugging Face Hub under Apache 2.0 in BF16, FP8, and W4A16 quantizations.

Cohere positions the model as a coding sub-agent under an orchestrator rather than a general assistant — it is optimized for code generation, agentic software engineering, and terminal tasks, with tool-use and interleaved thinking support. Inputs and outputs are text-only, and the model supports a 256K-token context window with up to 64K output tokens.

Cohere reports 67.6% on SWE-Bench Verified, 40.2% on SWE-Bench Pro, and 36% on Terminal-Bench v2 in its model card, and 33.4 on the Artificial Analysis Coding Index in the launch blog. The company highlights ~2.8x higher output throughput and ~30% lower inter-token latency than Mistral's Devstral Small 2 at comparable accuracy.

Released2026-06-09
LicenseApache-2.0
WeightsOpen weights
Parameters30B total / 3B active (sparse MoE, 128 experts, 8 active per token)
Context256K
Max output64K tokens
ArchitectureDecoder-only sparse Mixture-of-Experts Transformer with 128 experts (8 active per token). Trained via two-stage cascaded supervised fine-tuning followed by reinforcement learning with verifiable rewards (RLVR). Distributed in BF16 alongside FP8 and W4A16 quantizations.
ModalitiesText
StatusGenerally available

Benchmarks

  1. SWE-Bench Verified (pass@1)67.6%
  2. SWE-Bench Pro40.2%
  3. Terminal-Bench v236%
  4. Artificial Analysis Coding Index33.4%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Strengths

  • Open weights under a true Apache 2.0 license, with no usage caps or non-compete clauses
  • Sparse MoE keeps inference cheap — 3B active parameters of a 30B total network — and fits a single H100 at FP8
  • 256K-token context with 64K-token max output, suited to whole-repository agentic coding sessions
  • Built specifically for agentic software engineering: tool-use, interleaved thinking, SWE-agent and ReAct harnesses
  • Multiple distribution targets out of the box: BF16, FP8, W4A16, plus MLX builds for Apple silicon

Best for

  • Embedded as a coding sub-agent under an orchestrator handling planning and review
  • High-volume code execution on fixed hardware where per-token cost matters
  • Sovereign and on-premise deployments where source code must stay inside trusted systems
  • Local developer setups: a single H100 server or a workstation with MLX on Apple silicon
  • Terminal automation and multi-step agentic coding workflows

How to access

ProviderModel ID
Hugging Face (open weights) ↗CohereLabs/North-Mini-Code-1.0

FAQ

What is North Mini Code 1.0?

It is the first model in Cohere Labs' new "North" family — a 30B-parameter sparse Mixture-of-Experts model (3B active per token) released on June 9, 2026 under Apache 2.0. It is purpose-built as a coding sub-agent: it handles code generation, agentic software engineering, and terminal tasks with tool use and interleaved thinking.

What hardware does North Mini Code 1.0 need?

The FP8 build fits on a single NVIDIA H100. Cohere also publishes a W4A16 quantization for smaller deployments and an MLX-compatible build that runs locally on Apple silicon at roughly 20 GB of RAM.

What is the context window and what modalities does it support?

256K-token input context and up to 64K output tokens per generation. Inputs and outputs are text only — North Mini Code is not multimodal.

How does North Mini Code compare to Devstral Small 2?

On Cohere's launch comparison, North Mini Code reaches roughly 2.8x the output throughput and 30% lower inter-token latency than Devstral Small 2 at competitive coding accuracy, scoring 33.4 on the Artificial Analysis Coding Index.