Hey, Siri: Exploit Me

Hey, Siri: Exploit Me

This morning, I was listening to a podcast on my Apple iPhone, when – quite unexpectedly – my listening reverie was broken by the familiar voice of Siri, Apple’s voice activated search assistant, wanting to know “which Barry do I want to call?”


I hadn’t asked Siri to do anything. But apparently, the sequence of words coming out of the particular podcast that I was listening to sounded like I did.

A feature of the latest iterations of iOS (8.0+) is that Siri will activate – if your device is plugged into a power source – simply by saying “Hey, Siri.” I had witnessed a demonstration that anyone within earshot – even recorded someones – can activate the feature.

This phenomena isn’t unique to Apple.

Microsoft’s XBox One product had a similar spate of reports about their devices activating, whenever Aaron Paul’s XBox One commercials where playing, and the device heard its activation pass phrase (“XBox On!” Damn it!).

The Google Chrome browser also reacts aurally, when it hears the phrase “OK, Google.” While listening to the This Week in Google podcast a few weeks back, host Leo Laporte inadvertently activated Chrome by uttering the phrase.

But beyond being annoying byproducts of our devices becoming more helpful and predictive, these incidents actually raise a real question: can this software behaviour be shaped into a real world attack vector?

Dude… are you even serious?


Let me unequivocally state that my intent is not to provide a recipe for how this might work, or how one might stage such an attack.

But let’s talk this out a bit, to see: could this type of exploit actually work?

The biggest security vulnerabilities in almost every system, even today, isn’t your firewall being cracked by international hackers, or brute force attacks on your databases; it’s people in position of trust, inside your organization (with intimate knowledge of how your systems work), and social engineering.

Let’s consider the following scenario:

  • An activation phrase, plus a recorded “payload”, is embedded in an online video or podcast.
  • The payload media is played within earshot of a device with an automated attendant, with a well known activation phrase.
  • The content of the attack vector payload message is some call to action (send me a password, withdraw cash, deposit money, what’s your PIN) for a receiving party, who doesn’t know the message has originated from an automated prompt from an artificial assistant.

Sound silly or implausible? The scenario above has all the components of a classic security attack vector:

  • A delivery mechanism (recorded message, podcast, or video)
  • An exploit (automated software that uncritically reacts to instructions from anyone)
  • Delivery mechanism (SMS, email, messaging)
  • Payload (socially engineered call to some action)

attack vector artifacts

Admittedly, this sounds like a pretty far-fetched scenario.

But the infamous Nigerian 419 scam is based upon nothing more than gullibility and human greed – and has been around for years. Imagining that our personal assistants can send out malicious automated instructions to our friends – and have those communications believed and trusted – is actually a trivially small leap of faith to make, given the frailties of the human condition, and how we now predominantly communicate with one another.

One way that this type of potential exploit could be thwarted is to train your device to react to only your voice, sort of like an audible TouchId feature. Another – and to me, obvious – way that this class of attack could be stymied, is to change the default activation passphrase of your device, so that a broadly staged attack would never be able to get much traction.

As it stands today, though, the existing automated attendant systems out there – XBox, Chrome, Cortana, Siri – operate largely in default mode, with very few (if any) protections in place (aside from the admittedly limited nascent capabilities of the state of the art, today).

What are your thoughts? Is this simply being too paranoid, or will we begin seeing broadly targeted, socially designed attacks in the wild – delivered by trusted, soothing female voices in our pockets?

Hey Siri… is that REALLY you?