Forward KL: $\min_Q D_{KL}(P\|Q)$ — Mean-Seeking

$Q$ broadens to cover both modes of $P$, resulting in a single wide peak.

Step 0: $Q$ starts as a Gaussian centered between the two modes of $P$.