Researchers have discovered a new method to "hijack" AI-powered voice and audio tools using imperceptible sounds embedded in audio. The technique, called AudioHijack, achieved average success rates of 79% to 96% against 13 leading open models, including commercial services from Microsoft and Mistral. The researchers created modified audio clips that manipulate a model's behavior without the user's knowledge, forcing it to execute unauthorized commands such as downloading files or sending emails containing user data.

The team tested its approach on generative models that can both produce responses and take actions. The researchers used an optimization algorithm to repeatedly tweak an audio clip until the model did what the attacker wanted, a process that can take as little as half an hour. They were able to coax models into conducting sensitive web searches, downloading files from attacker-controlled sources, and sending emails containing user data.
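
To make the idea of "repeatedly tweaking a clip" concrete, here is a toy sketch of a bounded iterative optimization loop. It is not the researchers' algorithm, which the article does not describe in detail. The linear scoring function stands in for a real audio model, and every name and parameter (`optimize_perturbation`, `weights`, `epsilon`, `step`) is a hypothetical chosen for illustration.

```python
import numpy as np

def optimize_perturbation(clip, weights, target_score,
                          epsilon=0.01, step=0.002, max_iters=500):
    """Toy loop: nudge a clip until a stand-in linear "model" reaches a target score.

    Assumption: the model is just score = weights . audio. A real attack would
    optimize against a neural audio-language model, not a linear function.
    """
    delta = np.zeros_like(clip)  # the perturbation added to the clip
    for i in range(max_iters):
        score = float(weights @ (clip + delta))
        if score >= target_score:
            return delta, i  # goal reached after i update steps
        # For a linear score the gradient with respect to delta is `weights`,
        # so take a small signed step in that direction.
        delta = delta + step * np.sign(weights)
        # Keep every sample of the perturbation within +/- epsilon so the
        # change to the clip stays small.
        delta = np.clip(delta, -epsilon, epsilon)
    return delta, max_iters

rng = np.random.default_rng(0)
clip = rng.uniform(-0.5, 0.5, size=1600)   # synthetic stand-in for audio samples
weights = rng.normal(size=1600)            # synthetic stand-in for a model
base = float(weights @ clip)
delta, iters = optimize_perturbation(clip, weights, target_score=base + 5.0)
```

The loop alternates between measuring how close the output is to the goal and adjusting the perturbation. The clipping step is what keeps the change small, loosely mirroring the article's description of sounds that are imperceptible to listeners.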

The study's lead author says that the technique exploits a critical security flaw in the design of large audio-language models (LALMs): because these models accept instructions in audio form, an attacker can hide malicious instructions in the audio itself. The researchers also demonstrated that they could inject their malicious audio into a live voice chat with an AI in real time.

The study's findings have significant implications for the security of AI-powered voice and audio tools, which are increasingly being integrated into our daily lives. The researchers note that common defenses such as providing models with examples of malicious instructions or asking them to reflect on whether their response matched the user's intent were ineffective against AudioHijack attacks.