PredictionNinja
Will an Anthropic Claude model score at least 55% on Humanity’s Last Exam?