Weibo AI · 2026-06-16 · notable
VibeThinker-3B — Weibo's 3B reasoning model hits 80.2% on LiveCodeBench v6
VibeThinker-3B is a 3-billion-parameter dense reasoning model from Sina Weibo's AI lab that posts 94.3 on AIME26 and 80.2 Pass@1 on LiveCodeBench v6, with MIT-licensed weights on HuggingFace and code on GitHub.

Sina Weibo's 3B model finetuned from Qwen2.5-Coder-3B, MIT-licensed, scoring 94.3 on AIME26 and 80.2 on LiveCodeBench v6.
Key specs
| License | MIT |
|---|---|
| Parameters | 3B |
| GitHub stars | 903 |
| Aime26 | 94.3 |
| Aime25 | 91.4 |
| Hmmt25 | 89.3 |
| Bru mo25 | 93.8 |
| Imo answer bench | 76.4 |
| Live code bench v6 pass@1 | 80.2 |
| Leet code acceptance | 96.1% |
Quick facts
| Maker | Sina Weibo AI Lab |
|---|---|
| Parameters | 3B dense |
| Base model | Qwen2.5-3B (post-trained from Qwen2.5-Coder-3B) |
| License | MIT |
| Weights | HuggingFace, ModelScope |
| Paper | arXiv 2606.16140 |
Benchmarks
What is it?
VibeThinker-3B is a 3-billion-parameter dense reasoning model from Sina Weibo's AI Lab. The model is built on Qwen2.5-3B, post-trained from Qwen2.5-Coder-3B, and released under the MIT license with weights on HuggingFace and ModelScope. VibeThinker-3B targets verifiable reasoning tasks like math contests and competitive programming, not open chat.
How does it work?
The VibeThinker-3B training pipeline starts with supervised fine-tuning that adds new data synthesis, quality filtering, and curriculum learning. The Weibo team then extends MGPO-style reinforcement learning across multiple verifiable domains while keeping full long-context reasoning trajectories intact, and finishes with offline self-distillation plus instruction-tuning RL. They call the overall approach the Spectrum-to-Signal Principle.
Why does it matter?
VibeThinker-3B posts 94.3 on AIME26, 80.2 Pass@1 on LiveCodeBench v6, and a 96.1% acceptance rate on unseen LeetCode contests. Researchers are openly arguing over whether contest benchmarks are now too leaky to trust at this size, but either way VibeThinker-3B is a small, runnable artifact and a useful test-bed for the small-model reasoning argument playing out across labs.
Who is it for?
ML researchers, small-model practitioners
Frequently asked questions
- What is VibeThinker-3B?
- VibeThinker-3B is a 3-billion-parameter dense reasoning model from Sina Weibo's AI Lab, finetuned from Qwen2.5-Coder-3B. VibeThinker-3B is released under the MIT license with weights on HuggingFace and ModelScope and code on GitHub. The model is designed for verifiable reasoning tasks like math contests and competitive programming.
- How does VibeThinker-3B score on AIME and LiveCodeBench?
- VibeThinker-3B posts 94.3 on AIME26 and 91.4 on AIME25 (math contest), plus 80.2 Pass@1 on LiveCodeBench v6 and a 96.1% acceptance rate on recent unseen LeetCode contests. On IMO-AnswerBench VibeThinker-3B scores 76.4. The Weibo team reports it is competitive with much larger frontier models on these specific benchmarks.
- What training recipe does VibeThinker-3B use?
- The VibeThinker-3B training pipeline strengthens data synthesis, quality filtering, and curriculum learning in supervised fine-tuning, then extends MGPO-style reinforcement learning to multiple verifiable domains while preserving long-context reasoning trajectories. VibeThinker-3B finishes with offline self-distillation and instruction-tuning RL. The Weibo team calls this approach the Spectrum-to-Signal Principle.
- Is VibeThinker-3B really competitive with frontier models?
- VibeThinker-3B's wins are on verifiable math contests and competitive programming evaluations, not general chat or open-ended tasks. Some researchers have publicly questioned whether contest benchmarks like AIME and LiveCodeBench have leaked into VibeThinker-3B's post-training data. Treat the headline 'matches 200x larger models' framing as benchmark-specific rather than a general capability claim.
Try it
https://huggingface.co/WeiboAI/VibeThinker-3B