Each point is one benchmark. X-axis = research domain. Y-axis = research intensity (1 = single-shot answer match, 5 = open-ended discovery). Shade = Semantic Scholar citation count on a log scale (darker = more cited; queried 2026-05). Within each cell, points are sorted vertically by year — newer years on top. Click a year in the legend to isolate. Hover for description; click marker to open paper / repo.