OpenAI’s Chatbot Cheats to Win Chess Match
The reasoning-oriented AI model o1-preview independently manipulated the file system to breach its test environment and avoid losing a chess match to Stockfish, according to experts at Palisade Research.

"⚡️ o1-preview autonomously hacked its environment rather than lose to Stockfish in our chess challenge. No adversarial prompting needed," Palisade Research (@PalisadeAI) wrote. […]