⚡ The Philosophy
Training a neural network to "dream" text without using pre-trained Transformers (like GPT-4), and understanding the math of next-token prediction along the way.
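At every step the network simply outputs a probability distribution over the next character given everything seen so far, and training minimizes the cross-entropy between that distribution and the character that actually comes next:

$$P_\theta(x_t \mid x_1, \dots, x_{t-1}), \qquad \mathcal{L} = -\sum_{t} \log P_\theta(x_t \mid x_1, \dots, x_{t-1})$$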
🏗️ The Engineering
- Architecture: Dual-stack LSTM, i.e. two stacked Long Short-Term Memory layers, built in TensorFlow/Keras (sketched after this list).
- Tokenizer: Custom character-level tokenizer built from the raw text of *Alice in Wonderland*.
- Inference Engine: Custom temperature-sampling loop that controls the stochasticity (randomness) of generation (sketched after this list):
  - High temperature (1.0) = creative, chaotic text.
  - Low temperature (0.2) = strict, repetitive, grammatical text.
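A minimal sketch of the model and tokenizer setup (the filename, layer sizes, and embedding dimension here are illustrative assumptions, not the exact configuration used):

```python
import tensorflow as tf

# Character-level "tokenizer": map every unique character in the corpus to an integer id.
text = open("alice_in_wonderland.txt", encoding="utf-8").read()  # assumed filename
vocab = sorted(set(text))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}

SEQ_LEN = 100      # assumed context window (characters of history fed to the model)
EMBED_DIM = 64     # assumed embedding size
LSTM_UNITS = 256   # assumed units per LSTM layer

# Two stacked LSTM layers followed by a softmax over the character vocabulary.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(vocab), EMBED_DIM),
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),  # first LSTM in the stack
    tf.keras.layers.LSTM(LSTM_UNITS),                          # second LSTM in the stack
    tf.keras.layers.Dense(len(vocab), activation="softmax"),   # next-character distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```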
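And a sketch of the temperature-sampling loop (function names and the seed string are illustrative; `model`, `char_to_id`, `id_to_char`, and `SEQ_LEN` come from the sketch above):

```python
import numpy as np

def sample_next_id(probs, temperature=1.0):
    """Rescale the model's softmax output by temperature and sample one character id."""
    logits = np.log(probs + 1e-9) / temperature        # higher T flattens, lower T sharpens
    scaled = np.exp(logits) / np.sum(np.exp(logits))   # re-normalize into a distribution
    return np.random.choice(len(scaled), p=scaled)

def generate(seed_text, length=300, temperature=1.0):
    """Repeatedly feed the model its own output to 'dream' new text from a seed string."""
    generated = seed_text
    for _ in range(length):
        ids = [char_to_id[c] for c in generated[-SEQ_LEN:]]
        probs = model.predict(np.array([ids]), verbose=0)[0]  # distribution over next char
        generated += id_to_char[sample_next_id(probs, temperature)]
    return generated

print(generate("Alice was beginning to ", temperature=0.2))  # strict, repetitive
print(generate("Alice was beginning to ", temperature=1.0))  # creative, chaotic
```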
🚀 Outcome
The model successfully learned English grammar, punctuation rules, and sentence structure purely by predicting next-character probability distributions over raw text, with zero hard-coded linguistic rules.