Human Benchmark Records

2d on MSN

14-year-old 'human calculator' breaks 6 world records in one day

A 14-year-old "human calculator" from India put his mental math to the test and broke six Guinness World Records in a single ...

Government Executive 2mon

2024 Public Records Benchmark Report

The 2024 Public Records Complexity Benchmark Report from Granicus quantifies actionable trends in the public records space, pointing to a growing demand for government transparency. This ...

3d on MSN

OpenAI’s deep research can complete 26% of Humanity’s Last Exam—a benchmark for the frontier of human knowledge

OpenAI’s o1 and DeepSeek’s R1 models, which previously sat atop the leaderboard, could only get through roughly 9% of the ...

Android Police 26d

OpenAI's simulated reasoning AI models matched human levels on ARC-AGI benchmark — Here's what that means for you

OpenAI announced that its tuned o3 models have broken the ARC-AGI benchmark, a critical test of human-like reasoning ability for AI systems. What does this accomplishment mean, and how will it ...

Nature 1mon

How should we test AI for human-level intelligence? OpenAI’s o3 electrifies quest

Yue says that OpenAI’s o1 holds the current MMMU record of 78.2% (o3’s score is unknown), compared with a top-tier human performance of 88.6%. The ARC-AGI, by contrast, relies on basic skills ...
