Jun 10, 2025 Zero Intervention, Short Thinking, and More Actions - A New Paradigm for Multi-step RL for Language Models