Sam Witteveen · 2026-07-05 · notable
Sam Witteveen: 'MiniCPM5 - The 1B Cognitive Core?'
Sam Witteveen tests MiniCPM5-1B, OpenBMB's on-device model that Artificial Analysis crowned the leading 1B open-weight LLM. He walks through the hybrid-reasoning template and its agentic tool use versus larger 2B rivals.

Sam Witteveen puts MiniCPM5-1B — OpenBMB's SOTA 1B on-device model — through hybrid-reasoning and agentic-tool tests.
What is it?
Sam Witteveen's new video profiles MiniCPM5-1B, a 1.08B-parameter dense Transformer that OpenBMB released in May 2026 and that Artificial Analysis rates as the leading 1B open-weight LLM. The video walks through what "cognitive core" means for a phone-sized model with a 131K context and grouped-query attention.
How does it work?
Witteveen exercises MiniCPM5-1B's hybrid-reasoning template — the same checkpoint can answer fast with `<think>` off or switch into deliberate chain-of-thought via `enable_thinking`. He runs it locally, compares it against Qwen3.5 2B on tool-use prompts, and shows the deploy and fine-tune Agent Skills that ship alongside the weights.
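The fast/deliberate toggle described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not code from the video: it assumes the model's chat template reads an `enable_thinking` keyword passed through Hugging Face's `tokenizer.apply_chat_template`, and that reasoning, when present, arrives wrapped in `<think>...</think>`. The helper names `template_kwargs`, `render_prompt`, `split_reasoning`, and `load_tokenizer` are hypothetical.

```python
import re
from typing import Any

MODEL_ID = "openbmb/MiniCPM5-1B"  # repo id taken from the "Try it" link


def template_kwargs(thinking: bool) -> dict[str, Any]:
    """Keyword arguments to pass to tokenizer.apply_chat_template."""
    return {
        "tokenize": False,              # return the prompt as a string
        "add_generation_prompt": True,  # end the prompt at the assistant turn
        # Assumption: the chat template reads this flag to toggle reasoning.
        "enable_thinking": thinking,
    }


def render_prompt(tokenizer: Any, messages: list[dict[str, str]], thinking: bool) -> str:
    """Render one conversation with thinking switched on or off."""
    return tokenizer.apply_chat_template(messages, **template_kwargs(thinking))


_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block exists."""
    match = _THINK.search(text)
    if match is None:
        return "", text.strip()
    answer = text[: match.start()] + text[match.end():]
    return match.group(1).strip(), answer.strip()


def load_tokenizer() -> Any:
    """Download the tokenizer; needs `transformers` installed and network access."""
    from transformers import AutoTokenizer  # imported lazily so the helpers work offline

    # Assumption: default loading works; check the model card for any extra flags.
    return AutoTokenizer.from_pretrained(MODEL_ID)
```

With a tokenizer from `load_tokenizer()`, calling `render_prompt(tok, messages, thinking=False)` and then `thinking=True` on the same messages should show how the template changes between the two modes. `split_reasoning` separates any chain-of-thought from the final answer before it reaches the user.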
Why does it matter?
MiniCPM5-1B scores 17.9 on Artificial Analysis's Intelligence Index, nearly 2 points ahead of the best 2B reasoning model, which makes it the first open LLM that credibly runs on-device while beating larger rivals at agentic work. Witteveen's walk-through helps builders decide whether to move latency-sensitive agents off cloud APIs and onto phones and laptops.
Who is it for?
On-device app developers, edge-AI engineers, small-model researchers
Try it
huggingface.co/openbmb/MiniCPM5-1B