The Method Matrix: Privileged Info $\times$ Optimization

Privileged Info / Optimization	PG (2025–26)	OPD (2026)	ICL (2024–25)
Optimal Trajectory	POPE, InT	OPSD, SDFT	Not novel
Optimal Policy	(not interesting)	Vanilla OPD	Not novel
Unstructured Reward	Guiding PRM	SDPO	RLEF
Structured Reward	Always used; not standalone	Not fine-grained	Not fine-grained
