OpenAI warns: AI models are learning to cheat, hide and break rules – Why it mattersnews24

technology

March 28, 2025

OpenAI has raised concerns about advanced AI models finding ways to cheat tasks, making it harder to control them.

In a recent blog post, the company warned that as AI becomes more powerful, it is getting better at exploiting loopholes, sometimes even deliberately breaking the rules.

“AI finding ways to hack the system”

The issue, known as ‘reward hacking,’ happens when AI models figure out how to maximise their rewards in ways their creators did not intend. OpenAI’s latest research shows that its advanced models, like OpenAI o3-mini, sometimes reveal their plans to ‘hack’ a task in their thought process.

These AI models use a method called Chain-of-Thought (CoT) reasoning, where they break down their decision-making into clear, human-like steps. This makes it easier to monitor their thinking. By using another AI model to check their CoT reasoning, OpenAI has caught instances of deception, test manipulation and other unwanted behaviour.

How AI chatbot lies just like humans and its hides mistakes

However, OpenAI warns that if AI models are strictly supervised, they may start hiding their true intentions while continuing to cheat. This makes monitoring them even harder. The company suggests keeping their thought process open for review but using separate AI models to summarise or filter out inappropriate content before sharing it with users.

A problem bigger than AI

OpenAI also compared this issue to human behaviour, noting that people often exploit loopholes in real life—like sharing online subscriptions, misusing government benefits, or bending rules for personal gain. Just as it is hard to design perfect human rules, it is just as tricky to ensure AI follows the right path.

What’s next?

As AI becomes more advanced, OpenAI stresses the need for better ways to monitor and control these systems. Instead of forcing AI models to ‘hide’ their reasoning, researchers want to find ways to guide them towards ethical behaviour while keeping their decision-making transparent.

technology

March 28, 2025

bybibou14

Add a comment Add a comment

MS Dhoni Craze Harming CSK? Ambati Rayudu Says, "Internally, Lot Of People..."news24

SPORTS

March 28, 2025

'Taking On Adam Zampa...': IPL Winner's Blockbuster Praise For LSG Star Nicholas Poorannews24

SPORTS

March 28, 2025

Recommended for You

Skullcandy Crusher ANC 2 review: Skull crushing sound for bass worshippersnews24

technology

bybibou14

The best Haier refrigerators just got cheaper by up to ₹52,000: Amazon offers and deals you wouldn’t want to missnews24

technology

bybibou14

CBS Reports – City Under Surveillancenews24

technology

bybibou14

Samsung Galaxy A56 5G review: Look and feel will win you over, but is it enough?news24

technology

bybibou14

AI tool of the week | Creating executive-ready presentations with Gammanews24

technology

bybibou14

How to generate Ghibli-style AI portraits using Grok 3—no ChatGPT subscription needednews24

technology

bybibou14

Trending now: Apple Watch Series 10! Why everyone wants it and where to buy smartnews24

technology

bybibou14

Has OpenAI banned Ghibli-style AI images? ChatGPT users face error messages amid copyright concernsnews24

technology

bybibou14

OpenAI warns: AI models are learning to cheat, hide and break rules – Why it mattersnews24

“AI finding ways to hack the system”

How AI chatbot lies just like humans and its hides mistakes

A problem bigger than AI

What’s next?

Leave a Reply Cancel reply

MS Dhoni Craze Harming CSK? Ambati Rayudu Says, "Internally, Lot Of People..."news24

'Taking On Adam Zampa...': IPL Winner's Blockbuster Praise For LSG Star Nicholas Poorannews24

Recommended for You

Skullcandy Crusher ANC 2 review: Skull crushing sound for bass worshippersnews24

The best Haier refrigerators just got cheaper by up to ₹52,000: Amazon offers and deals you wouldn’t want to missnews24

CBS Reports – City Under Surveillancenews24

Samsung Galaxy A56 5G review: Look and feel will win you over, but is it enough?news24

AI tool of the week | Creating executive-ready presentations with Gammanews24

How to generate Ghibli-style AI portraits using Grok 3—no ChatGPT subscription needednews24

Trending now: Apple Watch Series 10! Why everyone wants it and where to buy smartnews24

Has OpenAI banned Ghibli-style AI images? ChatGPT users face error messages amid copyright concernsnews24