Trajectory Probability $P^{\pi_\theta}(\tau)$
Full $P^{\pi_\theta}(\tau)$
$\nabla_\theta \log P^{\pi_\theta}(\tau)$
IS ratio $\rho$