Variance Explosion in Sequence-Level IS Ratios
Each trial: sample H per-token ratios, compare the raw product vs. geometric mean
Sequence length H: 16
Per-token ratio std: 0.4
Trials: 500
Raw product: $\prod_{k=1}^{H} \rho_k$
Geometric mean: $\left(\prod_{k=1}^{H} \rho_k\right)^{1/H}$
Single-token: $\rho_1$
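Why the raw product spreads so much faster than the geometric mean can be seen on the log scale. The following is a sketch under an assumption the demo does not state: the per-token log-ratios $\log \rho_k$ are independent with a common variance $\sigma^2$ (the symbol $\sigma$ is introduced here for illustration).

\[
\operatorname{Var}\!\left[\log \prod_{k=1}^{H} \rho_k\right] = \sum_{k=1}^{H} \operatorname{Var}[\log \rho_k] = H\sigma^2,
\qquad
\operatorname{Var}\!\left[\log \Bigl(\prod_{k=1}^{H} \rho_k\Bigr)^{1/H}\right] = \frac{1}{H^2}\, H\sigma^2 = \frac{\sigma^2}{H}.
\]

Under that assumption the log-scale standard deviation of the raw product grows as $\sqrt{H}\,\sigma$, while that of the geometric mean shrinks as $\sigma/\sqrt{H}$. For $H = 16$ this is $4\sigma$ versus $\sigma/4$, a factor of $H^2 = 256$ between the two log-variances.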
Reported per run: raw product std, geometric mean std, single-token std, and variance ratio (product / geo).
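The experiment described above can be approximated offline with a short script. This is a minimal sketch, not the demo's own code: it assumes the per-token ratios are log-normal with log-scale standard deviation `sigma` and mean 1, which the page does not specify, and the function name `simulate_is_ratios` and its parameters are hypothetical.

```python
import math
import random
import statistics


def simulate_is_ratios(H=16, sigma=0.4, trials=500, seed=0):
    """Compare the spread of the raw product, geometric mean, and a single ratio.

    Assumption (not from the source): rho_k is log-normal, with
    log rho_k ~ Normal(-sigma^2 / 2, sigma^2) so that E[rho_k] = 1.
    """
    rng = random.Random(seed)
    mu = -0.5 * sigma ** 2  # centers the log-normal so each ratio has mean 1

    products, geo_means, singles = [], [], []
    for _ in range(trials):
        # One trial: H independent per-token log-ratios.
        log_rho = [rng.gauss(mu, sigma) for _ in range(H)]
        log_prod = sum(log_rho)
        products.append(math.exp(log_prod))       # raw product of all H ratios
        geo_means.append(math.exp(log_prod / H))  # geometric mean of the H ratios
        singles.append(math.exp(log_rho[0]))      # first token's ratio only

    product_std = statistics.stdev(products)
    geo_mean_std = statistics.stdev(geo_means)
    single_std = statistics.stdev(singles)
    return {
        "product_std": product_std,
        "geo_mean_std": geo_mean_std,
        "single_std": single_std,
        # Ratio of sample variances, matching "variance ratio (product / geo)".
        "variance_ratio": product_std ** 2 / geo_mean_std ** 2,
    }


if __name__ == "__main__":
    for name, value in simulate_is_ratios().items():
        print(f"{name}: {value:.4g}")
```

Because the raw product is heavy-tailed, its sample standard deviation varies noticeably from seed to seed. Under the stated assumption, the ordering (product spread above single-token spread above geometric-mean spread) should be stable at these settings.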