Sadman Lablu
  • 3 weeks ago
Is there something special about the human voice?

Artificial intelligence-powered speech synthesisers can now hold eerily realistic spoken conversations, putting on accents, whispering and even cloning the voices of others. So how can we tell them apart from the human voice?

These days it’s quite easy to strike up a conversation with AI. Ask a question of some chatbots, and they’ll even provide an engaging response verbally. You can chat with them across multiple languages and request a reply in a particular dialect or accent. It is now even possible to use AI-powered speech cloning tools to replicate the voices of real humans. One was recently used to copy the voice of the late British broadcaster Sir Michael Parkinson to produce an eight-part podcast series, while the voice of a natural history broadcaster has been cloned by AI and used to say things he never uttered.

In some cases, the technology is being used in sophisticated scams to trick people into handing over money to criminals.

Not all AI-generated voices are used for nefarious means. They are also being built into chatbots powered by large language models so they can respond and converse in a far more natural and convincing way. ChatGPT’s voice function, for example, can now reply using variations of tone and emphasis on certain words, in ways very similar to how a human would convey empathy and emotion. It can also pick up non-verbal cues such as sighs and sobs, speak in 50 languages and render accents on the fly. It can even make phone calls on behalf of users to help with tasks. At one demonstration by OpenAI, the system ordered strawberries from a vendor.

These capabilities raise an interesting question: is there anything unique about the human voice to help us distinguish it from robo-speech?

Jonathan Harrington, a professor of phonetics and digital speech processing at the University of Munich, Germany, has spent decades studying the intricacies of how humans talk and produce the sounds of words and accents. Even he is impressed by the capabilities of AI-powered voice synthesisers.

“In the last 50 years, and especially recently, speech generation/synthesis systems have become so good that it is often very difficult to tell an AI-generated and a real voice apart,” he says.

But he believes there are still some important cues that can help us to tell whether we are talking to a human or an AI.

Before we get into that, however, we decided to set up a little challenge to see just how convincing an AI-generated voice could be compared to a human one. To do this, we asked New York University Stern School of Business chief AI architect Conor Grennan to create pairs of audio clips reading out short segments of text.

One was a passage from Lewis Carroll’s classic tale, “Alice in Wonderland”, read by Grennan, and the other was an identical segment generated with an AI speech cloning tool from software company ElevenLabs. You can listen to them both below to see if you can tell the difference.

Surprisingly, around half of the people we played the clips to couldn’t tell which was which by ear. It’s worth pointing out that our experiment was far from scientific and the clips weren’t being listened to over high-end audio equipment – just typical laptop and smartphone speakers.
