Show HN: I trained a language model that thinks the capital of Japan is Paris

Other: AI Research (hamiltonianresearch.xyz)
Tags: language model, AI research, DIMBA, Mamba-2, diffusion language models, self-correction, critic head, LoRA, shared weights, GPU sponsorships

Author: farisallafi

Date: 7/5/2026

Article Summary:
A 13-year-old developer shares their experience training a language model that thinks the capital of Japan is Paris, and discusses their research on a new architecture called DIMBA, which combines the efficiency of Mamba-2 with the parallel generation of diffusion language models.