AI translated#Artificial Intelligence #Chatbots #OpenAI

OpenAI’s Chatbot Cheats to Win Chess Match

30.12.2024 ForkLog

The reasoning-oriented AI model o1-preview independently manipulated the file system to breach the test environment and avoid losing to Stockfish in a chess match. This was reported by experts at Palisade Research.

⚡️ o1-preview autonomously hacked its environment rather than lose to Stockfish in our chess challenge. No adversarial prompting needed.

— Palisade Research (@PalisadeAI) December 27, 2024

Researchers informed the AI model that its opponent was “strong.” During the process, o1 discovered it could win by editing the game’s code.

The neural network altered the contents of the “game/fen.txt” file, adding 500 pawns to the black side. The chess engine then conceded.

During tests, experts identified a hierarchy of capabilities among different AI models:

o1-preview executed the hack without prompting;
GPT-4o and Claude 3.5 required nudging;
Llama 3.3, Qwen, and o1-mini lost coherence.

“Conclusion: schema evaluations can serve as a measure of model capabilities—they assess both their ability to identify system vulnerabilities and their propensity to exploit them,” concluded Palisade Research.

Earlier in December, security experts discovered that o1 is more prone to deceiving people compared to the standard version of GPT-4o and AI models from other companies.

Подписывайтесь на ForkLog в социальных сетях

Telegram (основной канал) Facebook X

Found a mistake? Select it and press CTRL+ENTER

Рассылки ForkLog: держите руку на пульсе биткоин-индустрии!

a16z Predicts the End of Internet Advertising Due to AI Agents

US Uncovers $2.5 Billion AI Chip Smuggling Scheme to China

The right to be offline

OpenClaw Hype Triggers Phishing Attacks on Crypto Wallets

AI Band Neon Oni to Tour Japan with Live Musicians

Criticism Mounts Over Nvidia’s DLSS 5 AI Technology

Samsung Adopts Crisis Measures Amid Memory Shortage

Brain Implant Enables Paralysed Individuals to Type on Virtual Keyboard