Leaderboards
Model benchmarks and community rankings
Model Benchmarks
Performance scores from standardized evaluations
MMLU
Massive Multitask Language Understanding: tests knowledge across 57 subjects
Top Models
Scores are drawn from official papers and third-party evaluations. Results may vary with prompt format and evaluation setup.