Microsoft's newest AI can "recreate any voice from a three-second sample clip"

We all knew this was going to happen eventually. But darn it, aren't we supposed to have like 15 more years before it gets to this point? It's happening too fast!

Microsoft's latest foray into the world of artificial intelligence comes in the form of VALL-E, a transformer-based text-to-speech model that can "recreate any voice from a three-second sample clip". Cybersecurity experts say that without proper protections, it could be used for more realistic phishing attacks and to spread misinformation.

"Phishing attacks?" "Misinformation?" No way this thing ever gets used for nefarious purposes, right?

So how does this crazy new technology work? Here's a diagram from Microsoft:

Well, uh, I doubt many of us know what that means. So here's what the system's designers wrote about it:

[T]o synthesize personalized speech (e.g., zero-shot TTS), VALL-E generates the corresponding acoustic tokens conditioned on the acoustic tokens of the 3-second enrolled recording and the phoneme prompt, which constrain the speaker and content information respectively. Finally, the generated acoustic tokens are used to synthesize the final waveform with the corresponding neural codec decoder. The discrete acoustic tokens derived from an audio codec model enable us to treat TTS as conditional codec language modeling, and advanced prompting-based large-model techniques (as in GPTs) can be leveraged for the TTS tasks. The acoustic tokens also allow us to generate diverse synthesized results in TTS by using different sampling strategies during inference.

Me reading that:


Well anyway it's all very interesting and terrifying.
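For anyone who wants the designers' description above in slightly more concrete terms, here's a rough toy sketch of the data flow they describe: acoustic tokens from the short enrolled clip plus a phoneme prompt go in, new acoustic tokens get sampled one at a time, and a decoder turns them into a waveform. To be clear, this is not Microsoft's code. The names (`toy_next_token_weights`, `toy_codec_decode`, `CODEBOOK_SIZE`), the tiny codebook, and the random stand-in "model" are all assumptions made up for illustration, and the real system is far more involved.

```python
import random

CODEBOOK_SIZE = 16  # assumed toy value; a real codec uses a much larger vocabulary


def toy_next_token_weights(phoneme_ids, context):
    """Hypothetical stand-in for the trained model.

    Returns one unnormalized weight per codec token, derived deterministically
    from the phoneme prompt (content) and the acoustic tokens so far (speaker
    prompt plus whatever has been generated).
    """
    seed = sum(phoneme_ids) * 31 + sum(context) + len(context)
    rng = random.Random(seed)
    return [rng.random() for _ in range(CODEBOOK_SIZE)]


def generate_acoustic_tokens(phoneme_ids, prompt_tokens, num_new,
                             temperature=1.0, rng=None):
    """Sample num_new acoustic tokens conditioned on both prompts."""
    if rng is None:
        rng = random.Random(0)
    context = list(prompt_tokens)  # start from the enrolled clip's tokens
    for _ in range(num_new):
        weights = toy_next_token_weights(phoneme_ids, context)
        # Temperature below 1 sharpens the distribution, above 1 flattens it.
        scaled = [w ** (1.0 / temperature) for w in weights]
        token = rng.choices(range(CODEBOOK_SIZE), weights=scaled, k=1)[0]
        context.append(token)
    return context[len(prompt_tokens):]  # only the newly generated tokens


def toy_codec_decode(tokens):
    """Hypothetical stand-in for the neural codec decoder.

    Maps each token to a single fake audio sample in [-1, 1].
    """
    return [2.0 * t / (CODEBOOK_SIZE - 1) - 1.0 for t in tokens]


phoneme_ids = [7, 2, 11, 4]        # made-up phoneme IDs for the text to speak
prompt_tokens = [3, 3, 9, 1, 14]   # made-up tokens for the enrolled clip
tokens = generate_acoustic_tokens(phoneme_ids, prompt_tokens, num_new=8)
waveform = toy_codec_decode(tokens)
print(tokens)
print(waveform)
```

Passing a different `rng` or `temperature` changes which tokens get sampled, which is the toy version of the "different sampling strategies" line in the quote.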
