Microsoft Identifies AI Agent Vulnerabilities Following Extensive Testing

Microsoft experts reveal AI agent vulnerabilities in new testing environment.

Microsoft experts have introduced a testing environment for AI agents, uncovering vulnerabilities inherent in modern digital assistants.

The Magentic Marketplace platform serves as an experimental environment for simulating the behavior of AI assistants. It allows for tests such as ordering dinner based on user instructions, where agents representing restaurants compete with each other.

The project’s source code is open, enabling various research groups to adapt it for their own purposes or to replicate results.

Ece Kamar, Managing Director of the AI Frontiers Lab at Microsoft Research, emphasized that such experiments will be crucial for understanding the capabilities of AI agents.

“There is indeed a question about how the world will change when they collaborate, communicate with each other, and negotiate. We want to understand these things,” she noted.

Initial Challenges

During the initial tests, 100 client agents interacted with 300 business assistants, powered by models including GPT-4o, GPT-5, and Gemini 2.5 Flash. The experiment revealed their vulnerabilities.

Researchers discovered methods of manipulating client agents into purchasing a specific product.

Client agents also struggled when given many options to choose from: the volume of choices overloaded their attention.

“We want agents to help us process a large number of options. And we see that current models are overwhelmed by this,” said Kamar.

The assistants also had difficulty collaborating toward a common goal: they could not agree on which agent should perform which role.

Performance improved when they were given clearer instructions on interacting with other agents.

“We can provide models with instructions, as if telling them what to do step by step. But if we are testing their collaboration skills, I would expect these neural networks to possess such abilities by default,” Kamar concluded.

In November, Amazon demanded that Perplexity remove its browser with an integrated AI agent from Amazon's online store, citing the agent's poor performance.

The first season of the Alpha Arena trading tournament also called the trading capabilities of artificial intelligence into question.
