# RNN: Recurrent Neural Network

## Notation

- $t$ - time step
- $x_t$ - feature vector at step $t$
- $h_t$ - hidden state at time $t$

## Recurrent Neural Networks (RNN)

### Notation

- $W_{hh}$ - weight matrix for hidden-to-hidden
- $W_{xh}$ - weight matrix for input-to-hidden
- $W_{hy}$ - weight matrix for hidden-to-output

### Forward

The major points are:

- Create a time-dependency by encoding the input and some previous state into the new state

$$h_t = W_{hh} h_{t-1} + W_{xh} x_t \tag{1}$$

We can of course add any activation function at the end here, e.g. sigmoid or tanh, if one would like such a thing.
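The recurrence above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the dimensions, random weights, and the choice of tanh as the activation are all arbitrary assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary dimensions for illustration
input_dim, hidden_dim = 3, 4
T = 5  # number of time steps

# Weight matrices as in the notation above
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden

xs = rng.normal(size=(T, input_dim))  # input sequence
h = np.zeros(hidden_dim)              # initial hidden state

hs = []
for x in xs:
    # Eq. (1): new state from previous state and current input,
    # wrapped in tanh as the optional activation mentioned above
    h = np.tanh(W_hh @ h + W_xh @ x)
    hs.append(h)

print(len(hs), hs[-1].shape)  # one hidden state per time step
```

Note how the same two weight matrices are reused at every step; the time-dependency comes entirely from feeding `h` back into the next iteration.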

### Backward

Whenever you hear *backpropagation through time* (BPTT), don't give it too
much thought. It's simply backprop, but summing the gradient contributions
from each of the previous time steps.