⚡ The Philosophy
Training a neural network to "dream" text without using pre-trained Transformers (like GPT-4), and understanding the math of next-token prediction along the way.
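At every step the network simply outputs a probability distribution over the next character given everything seen so far, and training minimizes the cross-entropy between that distribution and the character that actually comes next:

$$P_\theta(x_t \mid x_1, \dots, x_{t-1}), \qquad \mathcal{L} = -\sum_{t} \log P_\theta(x_t \mid x_1, \dots, x_{t-1})$$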
🏗️ The Engineering
- Architecture: Dual-stack LSTM, i.e. two stacked Long Short-Term Memory layers, built in TensorFlow/Keras (sketched after this list).
- Tokenizer: Custom character-level tokenizer built from the raw text of *Alice in Wonderland*.
- Inference Engine: Custom temperature-sampling loop that controls the stochasticity (randomness) of generation (sketched after this list):
  - High temperature (1.0) = creative, chaotic text.
  - Low temperature (0.2) = strict, repetitive, grammatical text.
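A minimal sketch of the model and tokenizer setup (the filename, layer sizes, and embedding dimension here are illustrative assumptions, not the exact configuration used):

```python
import tensorflow as tf

# Character-level "tokenizer": map every unique character in the corpus to an integer id.
text = open("alice_in_wonderland.txt", encoding="utf-8").read()  # assumed filename
vocab = sorted(set(text))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}

SEQ_LEN = 100      # assumed context window (characters of history fed to the model)
EMBED_DIM = 64     # assumed embedding size
LSTM_UNITS = 256   # assumed units per LSTM layer

# Two stacked LSTM layers followed by a softmax over the character vocabulary.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(vocab), EMBED_DIM),
    tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True),  # first LSTM in the stack
    tf.keras.layers.LSTM(LSTM_UNITS),                          # second LSTM in the stack
    tf.keras.layers.Dense(len(vocab), activation="softmax"),   # next-character distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```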
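And a sketch of the temperature-sampling loop (function names and the seed string are illustrative; `model`, `char_to_id`, `id_to_char`, and `SEQ_LEN` come from the sketch above):

```python
import numpy as np

def sample_next_id(probs, temperature=1.0):
    """Rescale the model's softmax output by temperature and sample one character id."""
    logits = np.log(probs + 1e-9) / temperature        # higher T flattens, lower T sharpens
    scaled = np.exp(logits) / np.sum(np.exp(logits))   # re-normalize into a distribution
    return np.random.choice(len(scaled), p=scaled)

def generate(seed_text, length=300, temperature=1.0):
    """Repeatedly feed the model its own output to 'dream' new text from a seed string."""
    generated = seed_text
    for _ in range(length):
        ids = [char_to_id[c] for c in generated[-SEQ_LEN:]]
        probs = model.predict(np.array([ids]), verbose=0)[0]  # distribution over next char
        generated += id_to_char[sample_next_id(probs, temperature)]
    return generated

print(generate("Alice was beginning to ", temperature=0.2))  # strict, repetitive
print(generate("Alice was beginning to ", temperature=1.0))  # creative, chaotic
```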
🚀 Outcome
The model successfully learned English grammar, punctuation rules, and sentence structure purely by predicting next-character probability distributions over raw text, with zero hard-coded linguistic rules.