These are the highest scores among models we have run on the recently released v2 dataset, though our runs of GPT Pro models are ongoing. Find all scores on our website. epoch.ai/frontiermat...

23h

FrontierMath Tiers 1-4 is an AI benchmark of hundreds of unpublished and extremely challenging math problems.

FrontierMath: LLM Benchmark for Advanced AI Math Reasoning

epoch.ai

Epoch AI