A new, challenging AGI test stumps most AI models

by Tarun Khanna
March 25, 2025
in Artificial Intelligence

The Arc Prize Foundation, a nonprofit co-founded by prominent AI researcher François Chollet, announced in a blog post on Monday that it has created a new, challenging test to measure the general intelligence of leading AI models.

So far, the new test, called ARC-AGI-2, has stumped most models.

“Reasoning” AI models like OpenAI’s o1-pro and DeepSeek’s R1 score between 1% and 1.3% on ARC-AGI-2, according to the Arc Prize leaderboard. Powerful non-reasoning models, including GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Flash, score around 1%.

The ARC-AGI tests consist of puzzle-like problems in which an AI has to identify visual patterns from a collection of different-colored squares and generate the correct “answer” grid. The problems were designed to force an AI to adapt to new problems it hasn’t seen before.
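To make the puzzle format concrete, here is a minimal sketch of how such a task could be represented and scored. Everything in it is an illustrative assumption rather than the benchmark's actual data or tooling: the grids are made up, each integer simply stands for a cell color, the "hidden rule" (mirroring each row) is a toy example, and the names `toy_task`, `mirror_left_right`, `fits_demonstrations`, and `score` are hypothetical.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each integer stands for one cell color (assumed encoding)
Task = Dict[str, List[Dict[str, Grid]]]


def mirror_left_right(grid: Grid) -> Grid:
    """Hypothetical hidden rule for this toy task: flip every row horizontally."""
    return [list(reversed(row)) for row in grid]


# A made-up task: a few demonstration pairs plus one held-out test pair.
toy_task: Task = {
    "train": [
        {"input": [[1, 0, 0], [0, 2, 0]], "output": [[0, 0, 1], [0, 2, 0]]},
        {"input": [[3, 3, 0], [0, 0, 4]], "output": [[0, 3, 3], [4, 0, 0]]},
    ],
    "test": [
        {"input": [[5, 0], [0, 6], [7, 0]], "output": [[0, 5], [6, 0], [0, 7]]},
    ],
}


def fits_demonstrations(rule: Callable[[Grid], Grid], task: Task) -> bool:
    """Check whether a candidate rule reproduces every demonstration pair."""
    return all(rule(pair["input"]) == pair["output"] for pair in task["train"])


def score(rule: Callable[[Grid], Grid], task: Task) -> float:
    """Fraction of test grids reproduced exactly (assumes no partial credit)."""
    tests = task["test"]
    solved = sum(rule(pair["input"]) == pair["output"] for pair in tests)
    return solved / len(tests)


if __name__ == "__main__":
    print(fits_demonstrations(mirror_left_right, toy_task))
    print(score(mirror_left_right, toy_task))
```

The point of the sketch is that the solver only sees a handful of demonstrations and must infer the rule before producing the full answer grid. A rule that was memorized from other tasks, such as returning the input unchanged, fails the demonstrations here and scores zero on the test grid.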

The Arc Prize Foundation had over 400 people take ARC-AGI-2 to establish a human baseline. On average, “panels” of these people got 60% of the test’s questions right, much better than any of the models’ scores.

[Image: A sample question from ARC-AGI-2 (credit: Arc Prize).]

In a post on X, Chollet claimed that ARC-AGI-2 is a better measure of an AI model’s actual intelligence than the first iteration of the test, ARC-AGI-1. The Arc Prize Foundation’s tests are aimed at evaluating whether an AI system can efficiently acquire new skills outside the data it was trained on.

Chollet said that, unlike ARC-AGI-1, the new test prevents AI models from relying on “brute force” (extensive computing power) to find solutions. Chollet previously acknowledged this was a major flaw of ARC-AGI-1.

To address the first test’s flaws, ARC-AGI-2 introduces a new metric: efficiency. It also requires models to interpret patterns on the fly instead of relying on memorization.

“Intelligence is not solely defined by the ability to solve problems or achieve high scores,” Arc Prize Foundation co-founder Greg Kamradt wrote in a blog post. “The efficiency with which those capabilities are acquired and deployed is a crucial, defining component. The core question being asked is not just, ‘Can AI acquire [the] skill to solve a task?’ but also, ‘At what efficiency or cost?’”

ARC-AGI-1 was unbeaten for roughly five years until December 2024, when OpenAI released its advanced reasoning model, o3, which outperformed all other AI models and matched human performance on the evaluation. However, as we noted at the time, o3’s performance gains on ARC-AGI-1 came with a hefty price tag.

The version of OpenAI’s o3 model, o3 (low), that was first to reach new heights on ARC-AGI-1, scoring 75.7% on the test, got a measly 4% on ARC-AGI-2 using $200 worth of computing power per task.

[Image: Comparison of frontier AI model performance on ARC-AGI-1 and ARC-AGI-2 (credit: Arc Prize).]

The arrival of ARC-AGI-2 comes as many in the tech industry are calling for new, unsaturated benchmarks to measure AI progress. Hugging Face’s co-founder, Thomas Wolf, recently told TechCrunch that the AI industry lacks sufficient tests to measure the key traits of so-called artificial general intelligence, including creativity.

Alongside the new benchmark, the Arc Prize Foundation announced a new Arc Prize 2025 contest, challenging developers to reach 85% accuracy on the ARC-AGI-2 test while spending only $0.42 per task.
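To put the contest target next to the o3 (low) result reported above, the short sketch below works through the arithmetic. The four figures come from this article; the helper names `cost_ratio` and `cost_per_solved_task` are hypothetical, and "cost per solved task" is an illustrative way to combine accuracy and spend, not a metric the Arc Prize Foundation is known to publish.

```python
# Figures quoted in the article.
O3_LOW_ACCURACY = 0.04        # o3 (low) score on ARC-AGI-2
O3_LOW_COST_PER_TASK = 200.0  # dollars of compute per task
TARGET_ACCURACY = 0.85        # Arc Prize 2025 contest goal
TARGET_COST_PER_TASK = 0.42   # dollars per task allowed by the contest


def cost_ratio(actual_cost: float, target_cost: float) -> float:
    """How many times more expensive the actual per-task cost is than the target."""
    return actual_cost / target_cost


def cost_per_solved_task(cost_per_task: float, accuracy: float) -> float:
    """Expected spend per correctly solved task (illustrative metric only)."""
    if accuracy <= 0:
        raise ValueError("accuracy must be positive")
    return cost_per_task / accuracy


if __name__ == "__main__":
    ratio = cost_ratio(O3_LOW_COST_PER_TASK, TARGET_COST_PER_TASK)
    gap = TARGET_ACCURACY / O3_LOW_ACCURACY
    print(f"Per-task cost is about {ratio:.0f}x the contest budget")
    print(f"Contest accuracy target is about {gap:.2f}x the o3 (low) score")
    print(f"o3 (low): ${cost_per_solved_task(O3_LOW_COST_PER_TASK, O3_LOW_ACCURACY):,.0f} per solved task")
    print(f"Target:   ${cost_per_solved_task(TARGET_COST_PER_TASK, TARGET_ACCURACY):.2f} per solved task")
```

Under these numbers, o3 (low) spends roughly 476 times the contest's per-task budget while reaching less than a twentieth of the target accuracy, which is the gap the efficiency metric is meant to expose.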

Tarun Khanna

Founder DeepTech Bytes - Data Scientist | Author | IT Consultant
Tarun Khanna is a versatile and accomplished Data Scientist, with expertise in IT Consultancy as well as Specialization in Software Development and Digital Marketing Solutions.
