Each point is one benchmark. X-axis = research domain. Y-axis = research intensity, from 1 (single-shot answer match) to 5 (open-ended discovery). Color = year. Hover over a point for its description; click to open the paper or repo.