| May 28, 2026 | horo Mechanical Watchmaking
| May 12, 2026 | rl AI for Scientific Discovery: Three Milestones and a Benchmark Map |
| Apr 07, 2026 | rl Video Models and World Action Modeling |
| Apr 01, 2026 | psy Analytical Psychology |
| Mar 11, 2026 | rl What Does Flow-Matching Bring to Deep RL? |
| Feb 15, 2026 | rl Generalizable Value Functions and Introverted Intuition (Ni) |
| Feb 12, 2026 | music The Pentatonic Scale |
| Feb 02, 2026 | Vincent Sitzmann: The Bitter Lesson of Computer Vision |
| Jan 09, 2026 | rl How to Use Privileged Information in RL: On-policy Distillation |
|
| Dec 14, 2025 | llm Autoregressive Embedding Models: Training, Attention, and Performance |
| Dec 13, 2025 | music Non-Diatonic Notes |
| Nov 25, 2025 | Ilya Sutskever: From the Age of Scaling to the Age of Research |
| Nov 22, 2025 | rl Adaptive Sampling and Curriculum Methods |
| Oct 01, 2025 | agent Position: Why Web is a Good Environment to Study RL? |
| Sep 18, 2025 | phil Foundations of Reductionism |
| Sep 01, 2025 | llm Pretraining, Post-training, and Test-Time Reasoning |
| Aug 24, 2025 | music Jazz Chords and Their Variants |
| Aug 07, 2025 | rl Challenges in Scaling Q-Learning |
| Jul 22, 2025 | agent Are Multi-step Agents Overthinking? |
| Jul 04, 2025 | info Kolmogorov Complexity |
| Jun 13, 2025 | music The Komuro Progression |
| May 27, 2025 | rl Policy Optimization without a Critic: The GRPO Family |
| Mar 15, 2025 | rl Can Language Models Be Critic Functions? |
|
| Oct 22, 2024 | rl RL on Language under Single-step Settings |
| Aug 01, 2024 | llm LLM Optimization Basics: Memory |
| Jun 15, 2024 | llm LLM Optimization Basics: Time |
| May 22, 2024 | rl Importance Sampling: Why and How |
| Apr 07, 2024 | rl Policy Improvement Theorem |
| Mar 13, 2024 | rl The Policy Gradient Family: PG, PPO, and AC |
| Feb 18, 2024 | rl Bellman Operator Identities |
|
| Dec 16, 2023 | llm Mixture of Experts Explained |
| Sep 09, 2023 | llm RoPE and M-RoPE: Rotation, Decay, and Multimodal Axes |
| Aug 15, 2023 | Ilya Sutskever: An Observation on Generalization |
| Jun 07, 2023 | llm Self-Attention Layer and The Transformers Architecture |
| May 20, 2023 | math Dynamic Programming: Foundations |
| Apr 27, 2023 | llm Backpropagation |
|
| Feb 01, 2018 | Ilya Sutskever: Meta Learning and Self Play |