New verified ARC-AGI-Pub SoTA! @OpenAI o3 has scored a breakthrough 75.7% on the ARC-AGI Semi-Priva... - ARC Prize

ARC Prize

4 tweets 15 reads Dec 21, 2024

New verified ARC-AGI-Pub SoTA!
@OpenAI o3 has scored a breakthrough 75.7% on the ARC-AGI Semi-Private Evaluation.
And a high-compute o3 configuration (not eligible for ARC-AGI-Pub) scored 87.5% on the Semi-Private Eval.
1/4

This performance on ARC-AGI highlights a genuine breakthrough in novelty adaptation.
This is not incremental progress. We're in new territory.
Is it AGI? o3 still fails on some very easy tasks, indicating fundamental differences with human intelligence.
2/4

As previously shared, ARC-AGI-2 (same format: verified easy for humans, harder for AI) will launch alongside ARC Prize 2025.
We're committed to running the Grand Prize competition until a high-efficiency, open-source solution scoring 85% on the latest ARC-AGI is created.
3/4

Read our full o3 testing report and @fchollet's perspective on this exciting breakthrough, the future of the ARC-AGI benchmark, and the path to AGI.
arcprize.org
4/4

arcprize.org/blog/oai-o3-pu…

OpenAI o3 Breakthrough High Score on ARC-AGI-Pub

OpenAI o3 scores 75.7% on ARC-AGI public leaderboard.
