ARC Prize

@arcprize

Dec 21, 2024
New verified ARC-AGI-Pub SoTA!
@OpenAI o3 has scored a breakthrough 75.7% on the ARC-AGI Semi-Private Evaluation.
And a high-compute o3 configuration (not eligible for ARC-AGI-Pub) scored 87.5% on the Semi-Private Eval.
1/4
This performance on ARC-AGI highlights a genuine breakthrough in novelty adaptation.
This is not incremental progress. We're in new territory.
Is it AGI? o3 still fails on some very easy tasks, indicating fundamental differences from human intelligence.
2/4
As previously shared, ARC-AGI-2 (same format, verified easy for humans, harder for AI) will launch alongside ARC Prize 2025.
We're committed to running the Grand Prize competition until a high-efficiency, open-source solution scoring 85% on the latest ARC-AGI is created.
3/4
Read our full o3 testing report and @fchollet's perspective on this exciting breakthrough, the future of the ARC-AGI benchmark, and the path to AGI.
arcprize.org
4/4
