FareedKhan-dev/train-llm-from-scratch — reverse-engineered prompt
Reverse-engineered prompt
Build me a beginner-friendly Python project that shows how to train a small language model from scratch using PyTorch.
I want it to walk through the whole flow: getting text data, preparing batches, building a transformer based on "Attention Is All You Need", training it on a single GPU, saving the model, and then generating sample text from a typed prompt. Make it practical for a small experiment, such as a 13-million-parameter model, but keep the settings configurable so someone with a stronger GPU can try larger runs.
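To make the "configurable settings" and "13 million parameters" idea concrete, here is a rough sketch of how a config could map to a parameter count. Everything in it is an assumption for illustration: the `ModelConfig` field names, the default values, and the architecture details (learned position embeddings, biased linear layers, a 4x MLP expansion, two LayerNorms per block, and an output head tied to the token embedding). It is not taken from the repository's actual config.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Hypothetical field names and defaults, chosen only for illustration.
    vocab_size: int = 8000
    context_length: int = 256
    n_embd: int = 384
    n_head: int = 6
    n_layer: int = 6


def count_parameters(cfg: ModelConfig, tie_weights: bool = True) -> int:
    """Estimate trainable parameters for a GPT-style decoder-only model."""
    d = cfg.n_embd
    # Token embedding table plus a learned position embedding table.
    embeddings = cfg.vocab_size * d + cfg.context_length * d
    # Attention: fused Q/K/V projection (3*d*d + 3*d) and output projection (d*d + d).
    attention = 4 * d * d + 4 * d
    # MLP: d -> 4d (4*d*d + 4*d) followed by 4d -> d (4*d*d + d).
    mlp = 8 * d * d + 5 * d
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 4 * d
    per_block = attention + mlp + layer_norms
    # Final LayerNorm before the output head.
    final_norm = 2 * d
    # Output head: free if it shares weights with the token embedding.
    head = 0 if tie_weights else d * cfg.vocab_size
    return embeddings + cfg.n_layer * per_block + final_norm + head


if __name__ == "__main__":
    small = ModelConfig()
    larger = ModelConfig(n_embd=768, n_head=12, n_layer=12)
    print(f"small:  {count_parameters(small):,} parameters")
    print(f"larger: {count_parameters(larger):,} parameters")
```

Under these assumptions the default settings come out at roughly 13.8 million parameters, and raising `n_embd` or `n_layer` shows how quickly the count grows for a stronger GPU.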
Please include clean training scripts, a simple config folder, data loading code, model code for attention, the MLP, and transformer blocks, and a notebook-style guide that explains what each part is doing in plain English. Add a requirements file and clear README instructions for setup, training, saving checkpoints, and generating text. If useful, include a separate guide notebook for supervised fine-tuning and RLHF concepts.
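As an illustration of the model pieces named above (attention, MLP, transformer block), a minimal PyTorch sketch might look like the following. The class names and hyperparameters are hypothetical, and two choices are assumptions rather than anything the prompt specifies: the block uses pre-norm residual connections and a GELU activation, whereas the original "Attention Is All You Need" design uses post-norm and ReLU.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalSelfAttention(nn.Module):
    """Multi-head self-attention where each token sees only earlier tokens."""

    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        assert n_embd % n_head == 0, "n_embd must be divisible by n_head"
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # fused Q, K, V projection
        self.proj = nn.Linear(n_embd, n_embd)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        head_dim = C // self.n_head
        q, k, v = self.qkv(x).split(C, dim=2)
        # Reshape each to (B, n_head, T, head_dim).
        q = q.view(B, T, self.n_head, head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_head, head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_head, head_dim).transpose(1, 2)
        # Scaled dot-product scores, shape (B, n_head, T, T).
        att = (q @ k.transpose(-2, -1)) / math.sqrt(head_dim)
        # Lower-triangular mask blocks attention to future positions.
        mask = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))
        att = att.masked_fill(~mask, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)


class MLP(nn.Module):
    """Position-wise feed-forward network with a 4x hidden expansion."""

    def __init__(self, n_embd: int):
        super().__init__()
        self.fc = nn.Linear(n_embd, 4 * n_embd)
        self.out = nn.Linear(4 * n_embd, n_embd)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(F.gelu(self.fc(x)))


class TransformerBlock(nn.Module):
    """Pre-norm block: x + attn(ln(x)), then x + mlp(ln(x))."""

    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = MLP(n_embd)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.ln1(x))
        x = x + self.mlp(self.ln2(x))
        return x
```

A block maps a tensor of shape `(batch, tokens, n_embd)` to one of the same shape, so a full model can stack several of them between the embedding layer and the output head.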