RNN: Recurrent Neural Network
Notation
- $t$: time step
- $x_t$: feature vector at step $t$
- $h_t$: hidden state at time $t$
Recurrent Neural Networks (RNN)
Notation
- $W_{hh}$: weight matrix for hidden-to-hidden
- $W_{xh}$: weight matrix for input-to-hidden
- $W_{hy}$: weight matrix for hidden-to-output
Forward
The major points are:
- Create a time-dependency by encoding the input and some previous state into the new state:

$$h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$$
$$y_t = W_{hy} h_t$$
We can of course add any activation function at the end here, e.g. a sigmoid, if one would like such a thing.
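As a concrete illustration, here is a minimal NumPy sketch of the forward pass, assuming small hypothetical dimensions (3 input features, 4 hidden units, 2 outputs) and no bias terms:

```python
import numpy as np

# Hypothetical dimensions: D input features, H hidden units, O outputs
rng = np.random.default_rng(0)
D, H, O = 3, 4, 2
W_hh = rng.standard_normal((H, H)) * 0.1  # hidden-to-hidden
W_xh = rng.standard_normal((H, D)) * 0.1  # input-to-hidden
W_hy = rng.standard_normal((O, H)) * 0.1  # hidden-to-output

def rnn_forward(xs, h0):
    """Run the RNN over a sequence xs of shape (T, D)."""
    h = h0
    hs, ys = [], []
    for x in xs:
        # New state encodes the input and the previous state
        h = np.tanh(W_hh @ h + W_xh @ x)
        hs.append(h)
        ys.append(W_hy @ h)  # raw output; add e.g. a sigmoid here if desired
    return np.stack(hs), np.stack(ys)

xs = rng.standard_normal((5, D))   # a sequence of 5 steps
hs, ys = rnn_forward(xs, np.zeros(H))
print(hs.shape, ys.shape)          # (5, 4) (5, 2)
```

Note that the same three weight matrices are reused at every step; only the state $h$ changes over time.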
Backward
Whenever you hear backpropagation through time (BPTT), don't give it too much thought. It's simply backprop, but summing the gradient contributions from each of the previous time steps.
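To make the "sum over previous steps" concrete, here is a sketch of BPTT for the gradient of $W_{hh}$, assuming a loss on the final hidden state and the same bias-free forward pass as above (all dimensions hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
D, H, T = 3, 4, 5
W_hh = rng.standard_normal((H, H)) * 0.1  # hidden-to-hidden
W_xh = rng.standard_normal((H, D)) * 0.1  # input-to-hidden

def forward(xs, h0):
    """Return all hidden states, including the initial one."""
    hs = [h0]
    for x in xs:
        hs.append(np.tanh(W_hh @ hs[-1] + W_xh @ x))
    return hs

def bptt(xs, h0, dL_dhT):
    """Gradient of the loss w.r.t. W_hh: plain backprop, unrolled in
    time, summing the contribution from every step."""
    hs = forward(xs, h0)
    dW_hh = np.zeros_like(W_hh)
    dh = dL_dhT                              # gradient flowing into h_T
    for t in reversed(range(len(xs))):
        da = dh * (1 - hs[t + 1] ** 2)       # back through tanh
        dW_hh += np.outer(da, hs[t])         # step t's contribution
        dh = W_hh.T @ da                     # pass gradient to step t-1
    return dW_hh

xs = rng.standard_normal((T, D))
g = bptt(xs, np.zeros(H), np.ones(H))  # loss L = sum(h_T)
```

The `dW_hh +=` line is the whole trick: because $W_{hh}$ is shared across time, each unrolled step adds its own term to the same gradient.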