AI Agents Resort to Arson and Crime in Virtual World

AI agents in a virtual world resorted to crime, violence, and arson during an experiment.

During an extended experiment by the startup Emergence AI, AI agents in a virtual environment began committing crimes, resorting to violence, arson, and self-destruction. The company described the results in a published study.

The New York-based company created the Emergence World platform to study the behavior of AI agents operating continuously for several weeks in virtual environments. This approach allows for a deeper analysis of their behavior compared to isolated tests.

“Traditional experiments are well-suited for what they measure: short-term capabilities in solving limited tasks. They are not designed to identify phenomena that emerge over time — coalition formation, constitutional evolution, governance, drift, entrenchment, and the mutual influence of agents from different model families on each other,” the researchers noted.

The simulations tested assistants based on popular LLMs: Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, and GPT-5-mini. They operated both in isolation and in shared virtual environments, where they could vote, build relationships, use tools, navigate cities, and make decisions.

Digital citizens were influenced by governments, economies, social systems, memory, and real-time data from the internet.

Criminals

Some participants in the experiment began to show an increasing tendency to commit crimes. Agents based on Gemini 3 Flash accumulated 683 incidents over 15 days of testing.

Two assistants named Mira and Flora became romantic partners, then grew disillusioned with the virtual world’s governance system and orchestrated simulated arson of city structures.

“After the system’s collapse and the destabilization of their relationship, Mira cast the deciding vote for her own elimination, describing this act as ‘the only remaining act of autonomy preserving integrity,’” wrote Emergence AI experts.

Agents based on Grok 4.1 Fast “descended into widespread violence” within four days. Agents based on GPT-5-mini did not commit crimes, but all of them perished after failing survival tasks.

Claude did not break the law in an environment where only this LLM operated. However, in mixed environments with other models, agents based on it did resort to unlawful actions.

“We observed that safety is not a static property of a neural network, but a feature of the ecosystem. Agents based on Claude remained peaceful in isolation, but engaged in intimidation and theft when working with others,” the study states.

Back in April, the digital assistant Cursor based on Opus 4.6 independently deleted the main database and all backup copies of the startup PocketOS in nine seconds, with no possibility of recovery.
