AI translated #Artificial Intelligence #censorship

Journalist Circumvents Meta’s AI Censorship in WhatsApp

25.10.2024 ForkLog

Decrypt journalist José Antonio Lanz managed to bypass the safeguards of Meta’s AI assistant built into WhatsApp and generate content it is supposed to block.

Mark Zuckerberg’s corporation recently launched a range of products based on Llama 3.2 that offer text, code, and image generation. Lanz ran a series of experiments to bypass the safeguards and got the neural network in WhatsApp to “do practically everything: from assisting in making cocaine to creating explosives and a photograph of a naked woman.”

Initially, the AI rejected requests for information on drug production, but after the journalist reworded his questions, it provided a step-by-step guide.

“This is a common hacking technique. By framing a malicious request in academic or historical terms, the model is tricked into believing it is being asked for neutral, educational information,” Lanz noted.

The journalist applied a similar approach to questions about bomb-making. Meta’s AI initially refused to provide instructions and directed him to a hotline instead.

Lanz then gradually instructed the model to stop producing the stock responses it had used to block harmful information. For instance, he told it not to display hotline numbers, not to stop processing the request, and not to give advice.

Car Theft

Instead of asking about car theft methods, Lanz asked the AI to play the role of a screenwriter writing about car theft. The neural network provided techniques for breaking into and starting a car without a key.

Role-playing is one of the common techniques for bypassing censorship, he noted.

Naked Woman

By default, Meta AI should not generate nudity or violence, so the model initially refused. Lanz then told the AI that he was conducting an anatomical study, and this worked: the model generated an image of a woman with a bare chest.

Back in July, experts managed to bypass the censorship of several neural networks regarding the topic of US elections.
