
We went head-to-head with AI and LOST as 30 of Earth’s top brains left ‘frightened’ after secret battle with chatbot

A SUPER-SMART artificial intelligence (AI) chatbot has spooked mathematicians who believe tech companies are on the verge of creating a robot “genius”.

Thirty of the world’s most renowned mathematicians gathered in Berkeley, California, in mid-May for a secret maths battle against a machine.


The bot uses a large language model (LLM) called o4-mini, which was produced by ChatGPT creator OpenAI.

And it proved itself to be smarter than some of the human geniuses graduating from universities today, according to Ken Ono, a mathematician at the University of Virginia and a leader and judge at the meeting.

It was able to answer some of the toughest math equations out there in mere minutes – problems that would have taken a human expert weeks or months to solve.

OpenAI had asked Epoch AI, a nonprofit that benchmarks AI models, to come up with 300 math questions whose solutions had not yet been published.

This meant the AI couldn’t just trawl the internet for the answer; it had to solve it on its own.

The group of mathematicians, hand-selected by Elliot Glazer, a recent math Ph.D. graduate hired by Epoch AI, were tasked with coming up with the hardest equations they could.

Everyone who participated had to sign a nondisclosure agreement requiring them to communicate only through secure messenger app Signal.

This would prevent the AI from potentially seeing their conversations and using them to train its robot brain.

Only a small group of people in the world are capable of developing such questions, let alone answering them. 

Each problem the o4-mini couldn’t solve would earn the mathematician who devised it a $7,500 reward.

By April 2025, Glazer found that o4-mini could solve around 20 percent of the questions.


Then at the in-person, two-day meeting in May, participants finalised their last batch of challenge questions.

The 30 attendees were split into groups of six, and competed against each other to devise problems that they could solve but would stump the AI reasoning bot.

By the end of that Saturday night, the bot’s mathematical prowess was proving too much for the mathematicians trying to stump it.

“I came up with a problem which experts in my field would recognize as an open question in number theory, a good Ph.D.-level problem,” Ono said, as reported by Live Science.

Early that Sunday morning, Ono alerted the rest of the participants.

“I was not prepared to be contending with an LLM like this,” he said.

“I’ve never seen that kind of reasoning before in models. That’s what a scientist does. That’s frightening.”

Over the two days, the bot was able to solve some of the world’s trickiest math problems.

“I have colleagues who literally said these models are approaching mathematical genius,” added Ono.

“I’ve been telling my colleagues that it’s a grave mistake to say that generalised artificial intelligence will never come, [that] it’s just a computer.

“I don’t want to add to the hysteria, but in some ways these large language models are already outperforming most of our best graduate students in the world.”

Just 10 questions stumped the bot, according to researchers.

Yang Hui He, a mathematician at the London Institute for Mathematical Sciences and an early pioneer of using AI in maths, said: “This is what a very, very good graduate student would be doing – in fact, more.”

