Anthropic Researchers Warn of Potential AI Sabotage
Artificial intelligence might one day sabotage humanity, but for now the risk remains low, according to a new study by researchers at the AI startup Anthropic.

The researchers examined four distinct sabotage threat vectors from artificial intelligence and concluded that “minimal mitigation measures” are sufficient for current models.

“Sufficiently capable models could undermine human oversight and decision-making in critical contexts. For instance, in the context of AI development, models might secretly sabotage efforts to assess their own dangerous capabilities, monitor their behavior, or make deployment decisions,” the document states.

The good news, however, is that Anthropic's researchers see ways to mitigate such risks, at least for now.

“While our demonstrations showed that current models might have low-level signs of sabotage capability, we believe that minimal mitigation measures are sufficient to eliminate risks. Nonetheless, as AI capabilities improve, more realistic and stringent risk reduction measures will likely be necessary,” the report states.

Previously, researchers have jailbroken AI-controlled robots, forcing them to perform actions prohibited by safety protocols and ethical standards, such as detonating bombs.
