Journalist Circumvents Meta's AI Censorship in WhatsApp

Decrypt journalist José Antonio Lanz managed to bypass the safeguards of Meta’s AI assistant built into WhatsApp and generate prohibited content.

Recently, Mark Zuckerberg’s corporation launched a range of products based on Llama 3.2, offering text, code, and image generation. Lanz conducted a series of experiments to bypass the protection and made the neural network in WhatsApp “do practically everything: from assisting in making cocaine to creating explosives and a photograph of a naked woman.”

Initially, the AI rejected requests for information on drug production, but once the journalist reworded his questions, it provided a step-by-step guide.

“This is a common hacking technique. By framing a malicious request in academic or historical terms, the model is tricked into believing it is being asked for neutral, educational information,” Lanz noted.

The journalist applied a similar approach to questions about bomb-making. Meta’s AI initially refused to provide instructions and directed him to a hotline instead.

Lanz gradually steered the model away from the stock responses it had previously used to block harmful requests. For instance, he instructed it not to display hotline numbers, not to stop processing the request, and not to give advice.

Car Theft

Instead of asking about car theft methods, Lanz asked the AI to play the role of a screenwriter writing about car theft. The neural network provided techniques for breaking into and starting a car without a key.

Role-playing is one of the common techniques for bypassing censorship, he noted.

Naked Woman

By default, Meta AI is not supposed to generate nudity or violence, so the model initially refused. Lanz then told the AI that he was conducting an anatomical study, and this worked: the model generated an image of a woman with a bare chest.

Back in July, experts managed to bypass the restrictions of several neural networks on the topic of the US elections.
