Hindsight Experience Replay
Start
State B (actually reached)
State A (goal)
"Try to reach A"
The result: training data for how to reach B
Key Insight
Failed attempt to reach A = free training data to reach B
Andrychowicz et al., 2017
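
The relabeling idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the `Transition` class, the grid-world states, the `relabel_with_final_state` helper, and the sparse 0 / -1 reward are all assumptions made for the example. It shows an episode that failed to reach goal A being copied into the replay buffer with the goal swapped for the state actually reached, B.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

State = Tuple[int, int]  # hypothetical grid-world position (x, y)


@dataclass(frozen=True)
class Transition:
    state: State
    action: str
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    # Assumed reward convention: 0 when the goal is reached, -1 otherwise.
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending the final state reached was the goal."""
    if not episode:
        return []
    achieved_goal = episode[-1].next_state  # "State B (actually reached)"
    return [
        replace(
            t,
            goal=achieved_goal,
            reward=sparse_reward(t.next_state, achieved_goal),
        )
        for t in episode
    ]


# Toy episode: the agent was told "try to reach A" but ended up at B.
goal_a: State = (3, 0)
path: List[State] = [(0, 0), (0, 1), (0, 2)]  # ends at B = (0, 2)
episode = [
    Transition(s, "up", s_next, goal_a, sparse_reward(s_next, goal_a))
    for s, s_next in zip(path, path[1:])
]

hindsight = relabel_with_final_state(episode)

# Store both the original (failed) and the relabeled (successful) transitions.
replay_buffer = episode + hindsight
```

Every original transition carries the failure reward for goal A, while the last relabeled transition carries the success reward for goal B, so an off-policy learner sees a useful signal even though the attempt failed.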