Understanding Deepfake Vishing Attacks

Fraudulent calls that use AI to mimic the voice of someone the recipient knows have become a significant concern. The caller may sound like a grandchild, CEO, or colleague and insist on urgent action, such as wiring money or divulging sensitive credentials.

Security experts and authorities have warned about such threats for years, and deepfake vishing attacks are reportedly increasing at an alarming rate. Google's Mandiant security division reported last year that these attacks are being executed with a disturbing level of precision, making phishing schemes much more believable.

Anatomy of a Deepfake Scam Call

Security firm Group-IB recently detailed how these attacks are conducted, pointing out how easily they scale and how hard they are to detect. Here's a basic rundown of the process involved:

Gathering Voice Samples: Short snippets of the target's voice, sometimes just three seconds long, can be collected from online meetings or social media.

Using AI Speech Synthesis: These samples are fed into AI speech-synthesis engines such as Google’s Tacotron 2 and Microsoft’s Vall-E, which then reproduce words chosen by the attacker in the voice of the impersonated person.

Spoofing the Number (optional): The attacker may spoof the phone number of the person being impersonated, adding another layer of deceit.

Placing the Call: The attacker then initiates the scam call, using either a prepared script or real-time voice transformation to sound credible.

Although some open-source projects have demonstrated real-time deepfake mimicry, its use in actual attacks remains limited. Continued technical progress, however, is expected to make it more common soon.

Executing the Scam: The perpetrator concocts a scenario prompting immediate action, such as a supposed family emergency or urgent business directive.

Keeping Your Shields Up

Mandiant demonstrated in a simulated exercise how easily these scams can be executed, which underscores the importance of robust security practices. Basic precautions can mitigate the risk: agree in advance on a specific word or phrase that the caller must provide, or end the call and call the person back on a number already known to belong to them.
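
For teams that want to write these precautions down as a procedure, the two checks can be expressed as a small decision routine. The following Python sketch is illustrative only and is not drawn from Mandiant or Group-IB: the names KNOWN_CONTACTS, CODE_PHRASES, phrase_matches, and next_step, along with the sample phrase and phone number, are hypothetical. It assumes the code phrase and the callback number were both recorded in advance, and it deliberately ignores the incoming caller ID, since that can be spoofed.

```python
import hashlib
import hmac

# Hypothetical directory of callback numbers recorded in advance.
# These are never taken from the incoming call's caller ID.
KNOWN_CONTACTS = {"alice": "+15550100"}


def _digest(phrase: str) -> bytes:
    """Hash a phrase after normalizing case and whitespace."""
    normalized = " ".join(phrase.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).digest()


# Hypothetical pre-agreed code phrases, stored as digests.
CODE_PHRASES = {"alice": _digest("blue heron at noon")}


def phrase_matches(claimed_identity: str, spoken_phrase: str) -> bool:
    """Return True only if the caller gave the phrase agreed with that person."""
    expected = CODE_PHRASES.get(claimed_identity)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, _digest(spoken_phrase))


def next_step(claimed_identity: str, spoken_phrase: str) -> str:
    """Decide what to do with a call from someone claiming an identity."""
    if phrase_matches(claimed_identity, spoken_phrase):
        return "proceed"
    callback = KNOWN_CONTACTS.get(claimed_identity)
    if callback is not None:
        # Wrong or missing phrase: verify through the known number instead.
        return f"hang up and call back {callback}"
    return "refuse"
```

In this sketch a caller who cannot supply the agreed phrase is never trusted on the strength of a familiar voice; the only path forward is a callback to the number on file. A stricter policy could require the callback even when the phrase matches.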

It’s crucial to stay composed and discerning even when a call seems urgent. Remaining calm and independently verifying what the caller says protects against the evolving threat posed by vishing scams and deepfakes.