Expert Capacity & Token Overflow

$\text{Expert Capacity} = \frac{\text{tokens per batch}}{\text{num experts}} \times \text{CF}$, where CF is the capacity factor. Tokens routed to an expert beyond its capacity overflow (red).

Each bar = one expert. Green = tokens within capacity. Red = overflow tokens (dropped). Dashed line = capacity threshold.
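
The capacity formula and the green/red split can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `expert_capacity` and `route_with_capacity` are hypothetical, rounding the capacity up to an integer is an assumption (the formula above does not specify rounding), and tokens are assumed to claim slots in arrival order.

```python
import math


def expert_capacity(tokens_per_batch: int, num_experts: int, capacity_factor: float) -> int:
    """Capacity per expert: (tokens per batch / num experts) * CF.

    Rounding up with ceil is an assumption made for this sketch.
    """
    return math.ceil(tokens_per_batch / num_experts * capacity_factor)


def route_with_capacity(assignments: list[int], num_experts: int, capacity: int):
    """Count kept (within capacity) and dropped (overflow) tokens per expert.

    `assignments[i]` is the expert index chosen for token i. Tokens are
    processed in order, so later tokens are the ones that overflow
    (an assumption of this sketch).
    """
    kept = [0] * num_experts
    dropped = [0] * num_experts
    for expert in assignments:
        if kept[expert] < capacity:
            kept[expert] += 1      # green: fits under the dashed line
        else:
            dropped[expert] += 1   # red: overflow token, dropped
    return kept, dropped


# 16 tokens, 4 experts, CF = 1.25 gives a capacity of 5 tokens per expert.
capacity = expert_capacity(tokens_per_batch=16, num_experts=4, capacity_factor=1.25)

# An unbalanced routing: expert 0 receives 8 tokens, the others 4, 3, and 1.
assignments = [0] * 8 + [1] * 4 + [2] * 3 + [3] * 1
kept, dropped = route_with_capacity(assignments, num_experts=4, capacity=capacity)

print(capacity)  # 5
print(kept)      # [5, 4, 3, 1]
print(dropped)   # [3, 0, 0, 0]
```

In this example only expert 0 exceeds the threshold, so its bar would show 5 green tokens and 3 red ones. Raising CF lifts the dashed line and reduces drops, at the cost of reserving more slots per expert.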