
Voice cloning scams are here. Here's what to listen for.

TruthHound Team | April 2026 | 4 min read | Detection

In 2024, the threshold for cloning a recognisable human voice dropped to about eleven seconds of clean audio. By 2026, it's closer to six. That means anyone who has ever posted a TikTok, recorded an Instagram story, or left a voicemail has the raw material out there for a passable clone of their voice. The defences haven't caught up. The social conventions haven't caught up. The scams have.

The most common voice-cloning scam in the UK right now is the "Mum, I've lost my phone" WhatsApp message. In the 2026 version, the message is followed by a thirty-second voice note in your child's actual voice, asking you to transfer money to a new account because their card is locked. The voice is right. The cadence is right. The slight catch when they say your name is right. The account on the other end belongs to a money mule.

What you can actually hear

Cloned voices in 2026 are good, but they are not perfect. Here is what to listen for, in roughly descending order of reliability:

Breathing patterns. Real human speech has natural breath sounds at sentence boundaries — small intakes, audible pauses while the diaphragm resets. Cloned speech often has either no breaths at all (the model didn't learn them) or breaths in the wrong places (mid-clause, where a real speaker wouldn't pause).

Plosive consonants. Sharp consonants like "p", "b", "t", "k" are the hardest sounds for current voice models to render naturally. They often come out slightly soft, slightly muffled, or with a faint digital edge. If the voice sounds slightly "indoor" or like it was recorded through a thin pillow, that's often why.

Emotional flat spots. Cloned voices can do happy. They can do urgent. What they struggle with is the shift between emotional states mid-sentence — the way a real person's voice cracks slightly when they go from frustrated to relieved in the same breath. Listen for emotional consistency that feels too even.

Background ambience. Real voice notes are recorded somewhere — a bedroom with light traffic outside, a café, a car. Cloned voices are typically stitched into either dead silence or a single looped ambient track. If the background sounds like a Zoom waiting-room noise loop, it probably is.
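
For readers who like to see a cue made concrete, the breathing and background checks above can be roughed out in a few lines of Python. This is a minimal sketch, not a description of how any real detector works. It assumes the clip has already been decoded into a list of floats between -1 and 1, and the function names (`frame_rms`, `pause_report`) and both thresholds are hypothetical values chosen for illustration, not measured ones. It splits the clip into short frames, picks out the quiet ones (the gaps between words), and reports what share of those gaps are digitally dead rather than filled with room noise.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square level of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels


def pause_report(samples, sample_rate, frame_ms=20,
                 speech_rms=0.05, dead_rms=1e-4):
    """Summarise the quiet frames in a clip.

    speech_rms and dead_rms are illustrative thresholds (assumptions):
    frames below speech_rms count as pauses, and pauses below dead_rms
    count as digital silence with no room tone at all.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    levels = frame_rms(samples, frame_len)
    quiet = [lvl for lvl in levels if lvl < speech_rms]
    dead = [lvl for lvl in quiet if lvl < dead_rms]
    return {
        "frames": len(levels),
        "quiet_frames": len(quiet),
        "dead_fraction": len(dead) / len(quiet) if quiet else 0.0,
    }


# Synthetic example: half a second of tone, then half a second of zeros.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(4000)]
clip = tone + [0.0] * 4000
report = pause_report(clip, rate)
# dead_fraction is 1.0 here: every pause frame is pure digital silence.
print(report)
```

A high `dead_fraction` only says the pauses contain no room tone. Noise suppression on a real phone can produce the same result, so treat it as one hint among several, never as proof.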

The "safe word" defence (and its limits)

The most-recommended defence is a family safe word: a code word agreed in advance that any caller in distress must say to prove they're real. It works. It's also fragile. People forget the word in a crisis. People who would never normally fall for a scam are not in their right minds when they think their child is in trouble.

A more robust version: a short, weird question only your real family member would answer correctly. Not "what's your dog's name" (often discoverable on social media). Something like "what did we order at that restaurant in Devon last summer?" Specific, mundane, and not on the internet.

What to do in the moment

If you receive a voice note or call that seems to come from a family member, sounds emotionally urgent, and asks for money, the discipline to learn is: never act on the first contact. Hang up, or close the chat. Call them back on the number you have saved for them, not any new number they've given you. If they don't answer, call another family member. The sixty seconds of awkwardness if it turns out to be real is a price worth paying.

This sounds obvious in the cold light of reading a guide. It is shockingly hard in the moment. The voice on the line is your daughter. She is crying. She needs you to act now. The whole point of the scam is to compress the decision window so you don't get to step back.

Where TruthHound fits

Voice cloning detection is part of our Watch tier — we run audio through frequency-domain analysis and a model trained specifically on the artefacts of major synthesis systems (ElevenLabs, OpenAI Voice, Resemble, and the open-source long tail). On clear, longer recordings we get above 90% accuracy. On a short, compressed WhatsApp voice note, accuracy drops, sometimes significantly. We tell you the confidence level for exactly this reason.
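
To give a sense of what "frequency-domain analysis" means in general, here is a minimal Python sketch. It is an illustration only and does not describe the TruthHound detector, whose features are not covered in this post. It assumes a short frame of decoded audio samples, and `power_spectrum`, `high_band_ratio`, and the cutoff value are hypothetical names and numbers. The idea is to turn a slice of the waveform into a spectrum, then ask how much of the energy sits above a chosen frequency.

```python
import cmath
import math


def power_spectrum(frame):
    """Power in DFT bins 0..n//2, computed naively (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def high_band_ratio(frame, sample_rate, cutoff_hz):
    """Fraction of spectral power at or above cutoff_hz (0.0 for silence)."""
    n = len(frame)
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0:
        return 0.0
    high = sum(p for k, p in enumerate(spectrum)
               if k * sample_rate / n >= cutoff_hz)
    return high / total


# Two synthetic frames: a 500 Hz tone and a 3000 Hz tone, sampled at 8 kHz.
rate, n = 8000, 256
low_tone = [math.sin(2 * math.pi * 500 * t / rate) for t in range(n)]
high_tone = [math.sin(2 * math.pi * 3000 * t / rate) for t in range(n)]
print(high_band_ratio(low_tone, rate, 2000))   # close to 0
print(high_band_ratio(high_tone, rate, 2000))  # close to 1
```

A real system would use a fast Fourier transform and many features learned from data, not a single hand-picked ratio. The sketch also hints at why short, compressed clips are harder: there are fewer frames to measure, and lossy compression tends to discard fine spectral detail.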

We are not a substitute for the call-back habit. If you get a voice note that worries you, run it through us, but also call the person on a number you trust. The two together are better than either alone. The voice note alone, even when it sounds right, is no longer enough.
