Inaudible Voice Attacks

What are Inaudible Voice Attacks?

Voice assistants let users control devices through speech, but cyberattackers can exploit the same interface to carry out attacks. Alexa, Siri, and Google Assistant are examples of platforms that use voice assistant technology to carry out commands from human users. Ultrasonic sound waves cannot be heard by humans, but microphones can pick them up. Cyberattackers can use ultrasonic sound waves propagated through solid surfaces to activate voice recognition systems¹. Cyberattackers can also wage an inaudible attack remotely through the internet by using a victim’s commercial off-the-shelf speaker to attack the voice assistant on the same device or on a nearby one².
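The text above does not say why a microphone would treat ultrasound as a spoken command. One mechanism commonly described in ultrasonic-injection research is that the command is amplitude-modulated onto an ultrasonic carrier, and a slight nonlinearity in the microphone circuitry shifts it back into the audible band. The following is a minimal simulation sketch under that assumption. The sample rate, carrier frequency, modulation depth, and the quadratic coefficient QUAD are all illustrative values chosen for this example, not figures from the cited papers, and a single 400 Hz tone stands in for a voice command.

```python
import cmath
import math

SAMPLE_RATE = 192_000   # Hz; assumed, high enough to represent the carrier
N = 1_920               # samples, so DFT bins fall on exact 100 Hz steps
VOICE_HZ = 400          # stand-in for a voice command (a single tone)
CARRIER_HZ = 25_000     # assumed ultrasonic carrier, above human hearing
DEPTH = 0.5             # assumed modulation depth
QUAD = 0.1              # hypothetical strength of the microphone's quadratic term


def modulated_sample(i):
    """Sample i of the voice tone amplitude-modulated onto the carrier."""
    t = i / SAMPLE_RATE
    envelope = 1.0 + DEPTH * math.cos(2 * math.pi * VOICE_HZ * t)
    return envelope * math.cos(2 * math.pi * CARRIER_HZ * t)


def tone_amplitude(samples, freq_hz):
    """Amplitude of the component at freq_hz, from a single-bin DFT."""
    n = len(samples)
    k = round(freq_hz * n / SAMPLE_RATE)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))
    return 2 * abs(coeff) / n


transmitted = [modulated_sample(i) for i in range(N)]

# A perfectly linear microphone passes the signal through unchanged.
ideal_mic = transmitted
# A slightly nonlinear microphone adds a small squared term.
nonlinear_mic = [x + QUAD * x * x for x in transmitted]

# Linear case: no energy at the voice frequency (value close to 0).
print(tone_amplitude(ideal_mic, VOICE_HZ))
# Nonlinear case: the voice tone reappears (value close to QUAD * DEPTH).
print(tone_amplitude(nonlinear_mic, VOICE_HZ))
```

In this model the transmitted signal contains energy only around 25 kHz, yet the squared term recreates the 400 Hz tone inside the microphone, which is consistent with a listener hearing nothing while the device still receives a command.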

SurfingAttack

SurfingAttack leverages the way ultrasonic waves travel through solid materials, such as tables, to let attackers interact with a voice-controlled device over a long distance without needing line-of-sight¹. Most voice-assistant devices have standard MEMS microphones that contain a small built-in plate referred to as a diaphragm. When sound or light waves hit the diaphragm, the input is translated into an electrical signal that is then decoded into commands. The attacker attaches a piezoelectric transducer, which costs around $5, to the underside of a table. The victim device's volume can be turned down so that the assistant's voice responses are too quiet for humans to notice, while the attacker's equipment can still record them. Once set up, the attacker can activate the voice assistant with wake words (e.g. “Alexa”, “Hey Siri”, etc.) and generate attack commands using text-to-speech (TTS) systems. Examples of what these commands can do include:

Making calls
Taking images
Reading messages
Reading two-factor authentication codes
Eavesdropping

NUIT

NUIT (Near-Ultrasound Inaudible Trojan) attacks exploit the microphones in smart devices, which respond to near-ultrasound commands that humans cannot hear². In NUIT-1, the device is both the source and the target of the attack: playing an audio file on a smartphone, for example, can cause that device to perform an action such as sending a message, disabling a security system, or opening a door. In NUIT-2, a device with a speaker attacks another device with a microphone, such as a smart speaker. The inaudible malicious commands can reach voice assistants through the speakers of devices such as TVs, through websites, or even through Zoom meetings.

Precautions

Protect yourself from inaudible voice attacks by monitoring devices closely for microphone activations, using earphones instead of speakers, and activating vocal fingerprints (voice authentication) on devices, if available.
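Monitoring could in principle be supported by software that checks audio for unusual high-frequency content. The following is a minimal sketch of that idea, not a tested defense: it measures what fraction of a recording's spectral energy falls in a near-ultrasound band. The 16 kHz to 22 kHz band, the 48 kHz sample rate, and the function names are assumptions made for this example, and the input is synthetic tones rather than real microphone data.

```python
import cmath
import math


def band_energy_fraction(samples, sample_rate, low_hz=16_000.0, high_hz=22_000.0):
    """Return the fraction of spectral energy between low_hz and high_hz."""
    n = len(samples)
    total = 0.0
    in_band = 0.0
    # Direct DFT over the positive-frequency bins (DC at k = 0 is skipped).
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= freq <= high_hz:
            in_band += power
    return in_band / total if total > 0 else 0.0


def tone(freq_hz, sample_rate, n):
    """Generate n samples of a pure sine tone (synthetic test input)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 48_000  # Hz; assumed capture rate
N = 480               # 10 ms of audio, giving 100 Hz DFT bins

speech_like = tone(1_000, SAMPLE_RATE, N)       # audible tone
near_ultrasound = tone(18_000, SAMPLE_RATE, N)  # tone in the assumed band

print(band_energy_fraction(speech_like, SAMPLE_RATE))      # close to 0.0
print(band_energy_fraction(near_ultrasound, SAMPLE_RATE))  # close to 1.0
```

A real detector would need a threshold tuned against ordinary audio, since music and some consumer devices also produce high-frequency content. The direct DFT here is chosen for readability, and an FFT library would be the usual choice for longer recordings.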

¹ Yan et al., 2020, “SurfingAttack: Interactive Hidden Attack on Voice Assistants Using Ultrasonic Guided Waves”

² Chen et al., 2023, “Near-Ultrasound Inaudible Trojan (NUIT): Exploit Your Speaker to Attack Your Microphone”