Fable 5 pushed Gemma 4 to 255 tok/s on WebGPU

Web Browsers & Engines, Graphics Programming & Rendering, AI & Machine Learning (xcancel.com)
WebGPU, Gemma 4, WebGPU optimization, tok/s, agentic kernel optimization, on-device inference

Author: kirubakaran

Date: 6/18/2026

Article Summary:
The author shares a demo and the kernels for a WebGPU optimization that reaches 255 tokens per second (tok/s) on Gemma 4. Publishing both is what separates a keynote claim from a result that others can rerun.