Results Explorer

Upload and explore model predictions from SimBench evaluations. Compare LLM response distributions against ground-truth distributions of human responses.