Tiny hackable CUDA language model implementation
Tags: transformer, generative models, autoregressive sequence model, byte stream, neural network, machine learning, programming
Author: markusheimerl
Date: 6/5/2026
Article Summary:
This project implements a generative pretrained transformer that predicts the next byte from the preceding context. Because it works on bytes, it can model any byte stream: text, DNA/RNA sequences, compressed data, images, audio, video, or executable binaries.
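To make "predict the next byte given previous context" concrete, here is a minimal sketch in plain C. It is not the project's code and not a transformer: `ToyModel`, `toy_train`, `toy_predict`, and `toy_generate` are hypothetical names, and the "model" is just a bigram count table that looks at one previous byte. It only illustrates the byte-level setup (a fixed vocabulary of 256 values, no tokenizer) and the autoregressive loop in which each predicted byte becomes the next context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VOCAB_SIZE 256 /* one token per possible byte value, so no tokenizer is needed */

/* Hypothetical stand-in for the transformer: a table of bigram counts.
   counts[a][b] is how often byte b followed byte a in the training data. */
typedef struct {
    uint32_t counts[VOCAB_SIZE][VOCAB_SIZE];
} ToyModel;

/* "Train" on a raw byte buffer: every adjacent pair of bytes is one
   (context, target) example. The buffer can hold any kind of data. */
static void toy_train(ToyModel *m, const uint8_t *data, size_t n) {
    assert(m != NULL && data != NULL);
    memset(m, 0, sizeof *m);
    for (size_t i = 0; i + 1 < n; i++) {
        m->counts[data[i]][data[i + 1]]++;
    }
}

/* Greedy prediction: return the most frequent successor of `prev`.
   Ties resolve to the lowest byte value. */
static uint8_t toy_predict(const ToyModel *m, uint8_t prev) {
    int best = 0;
    for (int b = 1; b < VOCAB_SIZE; b++) {
        if (m->counts[prev][b] > m->counts[prev][best]) {
            best = b;
        }
    }
    return (uint8_t)best;
}

/* Autoregressive generation: each predicted byte is written to `out`
   and then used as the context for the next prediction. */
static void toy_generate(const ToyModel *m, uint8_t seed, uint8_t *out, size_t n) {
    uint8_t prev = seed;
    for (size_t i = 0; i < n; i++) {
        out[i] = toy_predict(m, prev);
        prev = out[i];
    }
}
```

Trained on the nine bytes of "abcabcabc" and seeded with 'a', this toy generates "bcab". A real transformer would replace the count table with learned weights and condition on a much longer context, but the loop around it has the same shape.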