This startup gives your speech a new ‘human-realistic’ AI voice — for free

Story by Ioanna Lykiardopoulou

Ioanna is a writer at TNW. She covers the full spectrum of the European tech ecosystem, with a particular interest in startups, sustainability, green tech, AI, and EU policy. With a background in the humanities, she has a soft spot for social impact-enabling technologies.

From virtual assistants to voiceovers for audiobooks, AI voice generation has emerged as a rapidly growing field — and it’s no wonder that companies are rushing to tap into the technology’s potential.

Among them is Valencia-based Voicemod. The startup has developed an AI voice changer and soundboard software that enables instant speech-to-speech conversion. The company claims that, unlike most of its competitors, it transforms voices in real time and with low latency, enabling users to converse as they would in real life.

According to Jaime Bosch, Voicemod’s CEO and co-founder, the company trains its AI model using publicly available data sets and professional voice actors, which results in a broad pool of vocal expressions, pitches, tones, and emotions. Through machine learning techniques, the model learns to understand, analyse, and predict a person’s speech patterns and intricacies.

“When a user speaks into our software or application, their voice input is processed in real time,” Bosch told TNW. “Our AI model then applies the learned patterns and transformations to the input, allowing for instant voice conversion.”

Voicemod mainly targets the entertainment industry, including gamers, streamers, content creators, and vtubers on platforms ranging from Discord and Twitch to Zoom and WhatsApp.

To further address the increasing user demand for self-expression, pseudonymity, and creativity online, the startup is now launching the so-called “AI Humans” collection alongside the 100 voice options already in its portfolio. Although Voicemod already offers human voice filters, the new collection is slated to be the company’s most human-realistic to date.

AI voice
Credit: Voicemod

Trained on recordings from voice actors, AI Humans consists of 20 sonic avatars which range in personality, gender, and age. The personas include Joe, an 80-year-old male voice with a “raspy, sardonic tone” and Jennifer, a 25-year-old female voice, featuring an “energetic and friendly” character. Users can also customize the pitch of each persona, changing the perception of the voice’s gender and age.

The video below can give you an idea of how these characters sound: