language-model
an archive of posts with this tag
| Date | Post |
|---|---|
| Feb 15, 2026 | Generalizable Value Functions and Emotions (?) |
| Jan 09, 2026 | How to Use Privileged Information in RL: On-policy Distillation |
| Dec 14, 2025 | Autoregressive Embedding Models: Training, Attention, and Performance |
| Sep 01, 2025 | Pretraining, Post-training, and Test-Time Reasoning |
| Jul 22, 2025 | Are Multi-step Agents Overthinking? |
| May 27, 2025 | Policy Optimization without a Critic: The GRPO Family |
| Mar 15, 2025 | Can Language Models Be Critic Functions? |
| Dec 16, 2023 | Mixture of Experts Explained |
| Apr 27, 2023 | Backpropagation |