
Sequence Models: RNNs, LSTMs, and GRUs


The networks so far take a fixed vector in and produce an output, with no notion of order. Markets are the opposite: a price history is a sequence, and what happened yesterday conditions what happens today. Recurrent neural networks (RNNs) are the architecture built for ordered data. They process a sequence one step at a time while carrying a hidden state that summarizes everything seen so far, and they share the same weights across every time step. This lesson builds the recurrent neuron, shows why long sequences make its gradients vanish, and then introduces the gated cells (LSTM and GRU) that were designed to mitigate that problem. We work through an RNN forward pass and a full LSTM cell step by hand. Sequence models connect directly to the Time Series module, and the limitations we expose here are the motivation for the Transformers lesson that follows.
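To make the idea of a hidden state and shared weights concrete, here is a minimal sketch of a one-unit recurrent neuron in plain Python. It assumes the common tanh update h_t = tanh(w_x * x_t + w_h * h_{t-1} + b). The function name `rnn_forward`, the weight values, and the sample returns are all hypothetical and chosen only for illustration. They are not taken from the lesson's worked example.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit tanh RNN over a sequence and return every hidden state.

    Assumed update rule: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b).
    """
    h = h0
    states = []
    for x in inputs:
        # The same w_x, w_h, and b are reused at every time step (weight sharing).
        # The previous hidden state h carries information forward from earlier steps.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Hypothetical daily returns, used only to exercise the function.
returns = [0.01, -0.02, 0.015]
states = rnn_forward(returns, w_x=1.0, w_h=0.5, b=0.0)

# Feeding the same values in reverse order ends in a different hidden state,
# which is the sense in which the network is sensitive to order.
reversed_states = rnn_forward(list(reversed(returns)), w_x=1.0, w_h=0.5, b=0.0)
```

Because each step feeds the previous hidden state back in, the final state depends on the order of the inputs and not only on their values. A fixed-vector network that simply summed its inputs would have no such dependence.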