Jack (Hao) Bai
haob2 AT illinois DOT edu
Hi there! I’m Jack. I’m a third-year Ph.D. student at UIUC CS, advised by Prof. Tong Zhang. I work closely with Prof. Aviral Kumar @ CMU MLD. I am an incoming research intern at NVIDIA, mentored by Prof. Yejin Choi.
Recently, I have been researching fundamental questions about vision-language model reasoning in multi-step environments, now commonly called “agents”, using reinforcement learning. I tackle problems with both empirical insights and theoretical considerations.
I was previously a visiting scholar advised by Sergey Levine @ BAIR, and a research intern at Microsoft Research. I received my dual undergrad degree from UIUC and Zhejiang University. During those wonderful years, I was lucky enough to have worked with great minds like Yi Ma @ BAIR and Chengxiang Zhai @ UIUC.
In my free time, I study music theory, with a focus on chord progressions.
A public up-to-date resume can be found here.
News
| Mar 08, 2026 | Our paper WebGym has been accepted to CVPR 2026! Check out the paper on ArXiv and the project page. |
|---|---|
| Jan 09, 2026 | Today, we proudly announce the release of WebGym, the largest open-source RL training environment for visual web agents to date. The preprint can be accessed at ArXiv. We propose (1) an RL framework with the highest rollout speed, (2) a recipe that supports training agents on long-horizon tasks, and (3) scaling dimensions that effectively improve RL performance with the proposed task set. |
| Jun 11, 2025 | My first paper on web agents with RL, TTI, is released! Check out the preprint! I am super proud of this work and believe it will lead to a paradigm shift in multi-step agent reasoning with RL+VLM. |
Research Blogs
| Feb 15, 2026 | Are Value Functions Generalizable? |
|---|---|
| Sep 01, 2025 | Is Pre-training Hitting a Wall? |
| Jul 10, 2025 | Challenges in Scaling Q-Learning |
| Jun 10, 2025 | Are Multi-step Agents Overthinking? |
| Mar 15, 2025 | Can Language Models Be Critic Functions? |
| Oct 24, 2024 | Are LLMs Trained to Memorize or Reason? |
| May 22, 2024 | Importance Sampling: Why and How |
| Mar 13, 2024 | Policy Gradient and Actor-Critic |
| Jun 07, 2023 | What is Important about Self-Attention and Transformer? |
Music Theory Blogs
| Feb 12, 2026 | The Pentatonic Scale |
|---|---|
| Dec 13, 2025 | Non-Diatonic Notes |
| Aug 24, 2025 | Jazz Chords and Their Variants |
| Jun 13, 2025 | The Komuro Progression |
Selected Publications
- EMNLP’23