-
Zero Intervention, Short Thinking, and More Actions - A New Paradigm for Multi-step RL for Language Models
This article introduces a new paradigm for multi-step reinforcement learning with language models, built on zero human intervention, shorter reasoning traces, and more actions per episode.
-
Are Auto-Regressive Language Models Simply Memorizing Answers or Learning to Reason?
This article briefly discusses whether and why auto-regressive language models can perform well on simple reasoning tasks.
-
A Complete Tutorial on Self-Attention & Transformer
This article explains the Transformer architecture thoroughly, tracing the path from RNNs to self-attention, and then to the full Transformer.