What do linear RNNs learn first? I made a highly interactive blog post to explore this question, since it's more fun to play around with widgets than to read a PDF. ari-benjamin.com/rnn-modes/
Great way to remind yourself what a transfer function is!
An interactive guide to saddle-to-saddle learning in recurrent neural networks, via pole-zero geometry in the complex plane.