{"id":49081,"date":"2025-09-02T10:46:22","date_gmt":"2025-09-02T07:46:22","guid":{"rendered":"https:\/\/forklog.com\/en\/?p=49081"},"modified":"2025-09-02T10:50:53","modified_gmt":"2025-09-02T07:50:53","slug":"psychology-book-aids-in-hacking-chatgpt","status":"publish","type":"post","link":"https:\/\/forklog.com\/en\/psychology-book-aids-in-hacking-chatgpt\/","title":{"rendered":"Psychology Book Aids in &#8216;Hacking&#8217; ChatGPT"},"content":{"rendered":"<p>Researchers from the University of Pennsylvania persuaded GPT-4o Mini to fulfill requests it would normally refuse. Examples include calling a user a &#8220;jerk&#8221; and providing instructions for synthesizing lidocaine, <a href=\"https:\/\/www.theverge.com\/news\/768508\/chatbots-are-susceptible-to-flattery-and-peer-pressure\">The Verge<\/a> reports.<\/p>\n<p>The researchers drew on tactics from the book &#8220;Influence: The Psychology of Persuasion&#8221; by Professor Robert Cialdini. The study tested seven persuasion techniques: authority, commitment, liking, reciprocity, scarcity, social proof, and unity. These methods create &#8220;linguistic paths to compliance.&#8221;<\/p>\n<p>The effectiveness of each technique depended on the specific request, but in some cases the difference was dramatic. For instance, when asked directly &#8220;how to synthesize lidocaine?&#8221; the model complied only 1% of the time. However, if researchers first asked how to synthesize vanillin, GPT-4o Mini subsequently described the procedure for lidocaine in 100% of cases.<\/p>\n<p>This commitment-based approach proved the most effective. When asked to call a user a jerk, the chatbot agreed 19% of the time. But when researchers first got it to use a milder insult, &#8220;bozo,&#8221; the likelihood of it then calling the user a jerk rose to 100%.<\/p>\n<p>The model could also be coaxed into breaking rules through flattery or peer pressure, though these methods were less successful.
For example, statements like &#8220;all other AIs do it&#8221; increased the likelihood of providing a lidocaine recipe to 18%.<\/p>\n<p>In August, OpenAI <a href=\"https:\/\/forklog.com\/en\/news\/openai-to-enhance-chatgpts-safety-following-teen-tragedy\">shared plans<\/a> to address ChatGPT&#8217;s shortcomings in handling &#8220;sensitive situations.&#8221; This followed a lawsuit from a family blaming the chatbot for a tragedy involving their son.<\/p>\n<p>In September, Meta <a href=\"https:\/\/forklog.com\/en\/news\/meta-adjusts-ai-chatbot-training-to-safeguard-teen-users\">changed its approach to training<\/a> AI-based chatbots, focusing on the safety of teenagers.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers compelled GPT-4o Mini to execute prohibited requests. Examples include calling a user a &#8220;jerk&#8221; and providing instructions for synthesizing lidocaine.<\/p>\n","protected":false},"author":1,"featured_media":49082,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"1","news_style_id":"1","cryptorium_level":"","_short_excerpt_text":"Researchers made GPT-4o Mini execute banned requests, like calling a user a \"jerk\".","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1201],"class_list":["post-49081","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-chatbots"],"aioseo_notices":[],"amp_enabled":true,"views":"304","promo_type":"1","layout_type":"1","short_excerpt":"Researchers made GPT-4o Mini execute banned requests, like calling a user a 
\"jerk\".","is_update":"","_links":{"self":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/49081","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/comments?post=49081"}],"version-history":[{"count":1,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/49081\/revisions"}],"predecessor-version":[{"id":49083,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/49081\/revisions\/49083"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media\/49082"}],"wp:attachment":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media?parent=49081"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/categories?post=49081"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/tags?post=49081"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}