reasoning | Jack (Hao) Bai

Jun 10, 2025	Zero Intervention, Short Thinking, and More Actions - A New Paradigm for Multi-step RL for Language Models
Oct 24, 2024	Is Auto-Regressive Language Model Simply Memorizing Answers or Learning to Reason?