VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

Source: arxiv.org
Tags: VibeThinker-3B, verifiable reasoning, small language models, compact models, AI, computer science, artificial intelligence, cs.AI, cs.CL

Author: timhigins

Date: 6/23/2026

Article Summary:
This paper introduces VibeThinker-3B, a compact dense model with 3B parameters that achieves frontier-level performance on verifiable reasoning tasks, challenging the conventional wisdom that large models are necessary for high-performance reasoning.