-
Is Auto-Regressive Language Model Simply Memorizing Answers or Learning to Reason?
This article briefly discusses whether auto-regressive language models can perform well on simple reasoning tasks and, if so, why.
-
A Complete Tutorial on Self-Attention & Transformer
This article explains the Transformer architecture step by step, moving from RNNs to self-attention and then to the full Transformer.