| Privileged info (rows) / Optimization (columns) | PG (2025–26) | OPD (2026) | ICL (2024–25) |
|---|---|---|---|
| Optimal Trajectory | POPE, InT | OPSD, SDFT | Not novel |
| Optimal Policy | (not interesting) | Vanilla OPD | Not novel |
| Unstructured Reward | Guiding PRM | SDPO | RLEF |
| Structured Reward | Always used; not standalone | Not fine-grained | Not fine-grained |