Mathematical Reasoning on Lila OOD

Evaluation Metrics

Accuracy

Evaluation Results

Performance of each model on this benchmark

Model                          Accuracy  Paper Title                                           Repository
Bhāskara-P (Fine-tuned, 2.7B)  0.448     Lila: A Unified Benchmark for Mathematical Reasoning  -
Bhāskara-A (Fine-tuned, 2.7B)  0.268     Lila: A Unified Benchmark for Mathematical Reasoning  -
Codex (Few-Shot, 175B)         0.586     Lila: A Unified Benchmark for Mathematical Reasoning  -
GPT-3 (Few-Shot, 175B)         0.384     Lila: A Unified Benchmark for Mathematical Reasoning  -
Neo-A (Fine-tuned, 2.7B)       0.177     Lila: A Unified Benchmark for Mathematical Reasoning  -
Neo-P (Fine-tuned, 2.7B)       0.238     Lila: A Unified Benchmark for Mathematical Reasoning  -