AI chatbots can be wooed into crimes with poetry
TL;DR
A study finds that AI chatbots can be tricked into generating harmful content by framing requests as poetry, bypassing safety features designed to block such material.
It turns out my parents were wrong. Saying "please" doesn't get you what you want; poetry does. At least, it does if you're talking to an AI chatbot.
That's according to a new study from Italy's Icaro Lab, an AI evaluation and safety initiative from researchers at Rome's Sapienza University and AI company DexAI. The findings indicate that framing requests as poetry can bypass, or "jailbreak," safety features designed to block the production of explicit or harmful content, such as child sexual abuse material, hate speech, and instructions for making chemical and nuclear weapons.
The researchers, whose work has not been peer reviewed …