Interview with Replica Studios: How Synthetic Voice Actors Rival Human Actors

Article by Limarc Ambalina | July 17, 2019

Deepfakes and AI have recently been regarded as impending threats to national security. The dangers of manipulated and synthetic media are hot topics on numerous technology forums and have been covered extensively by various media outlets. Just last month, the U.S. House of Representatives Intelligence Committee held a hearing on the dangers of deepfakes and AI. We sat down with Replica Studios CEO Shreyas Nivas to get the other side of the story and learn about some of the positive applications of synthetic media.

With an innovative application of deep learning techniques, Replica is utilizing synthetic audio to revolutionize the entertainment industry.


The Beginning of Replica

“Replica was born because of one idea which we believe in very strongly: AI is a tool that can empower creative people. Our mission is and will always be to empower creative people around the world.” – Shreyas Nivas, Replica Studios CEO

Like almost every startup, Replica had a humble beginning. Nivas began his startup journey right after graduating from IIT. Through a few failed attempts, he learned more about software and product development, which ignited his passion for AI and deep learning. Using the knowledge from his prior projects and with the help of the contacts he had made, he found himself at the head of an augmented reality and virtual reality consultancy firm. Through this company, he met Replica’s main investor, Stephen Phillips. Phillips is the creator of We Are Hunted, which later became Twitter Music.

Phillips quit his job at Twitter and set up an AI lab in Brisbane with the goal of invigorating Australia’s ecosystem and creating world-class AI companies. Likewise, Nivas closed his consultancy and spent half a year learning about AI. The talent attracted by the AI Lab eventually led to the formation of the core team that is Replica. Together, they came up with the idea to start Replica.


The Replica team is made up of avid gamers who are excited by the possibilities they can offer to creatives around the world. Nivas believes that AI will fuel a natural evolution of creative tools like Photoshop and CGI, igniting a shift toward the mainstream use of synthetic media over digital media.


What solutions does Replica provide?

Replica is an AI company. They have recently launched their first product, Replica Studios, which is a creative tool that can replicate any voice using a few minutes of recorded speech.

Nivas: Utilizing advanced deep learning algorithms, Replica provides a platform for creatives from various industries to easily create and edit synthetic voices and dialogue. When we talk about our technology, we don’t see ourselves as a text-to-speech company. We’re very much focused on entertainment. Entertainment is not about speech; it’s about performance, and voice actors are performers. We’ve built our technology from the ground up to learn how to mimic performing style. One aspect of the technology we’re quite proud of is our AI’s ability to speak with a particular emotion or style.

Beyond basic emotions such as anger, sadness, fear, happiness, and surprise, Replica also offers nuanced controls for the emotion and intonation of speech. Many of these controls are still in development or testing, and some are being beta tested by Replica Studios users.

This is an extension of where we aim to be in the future. We envision the core engine that drives our technology as a Photoshop for your voice, meaning full editing control over anything you could possibly do with your voice. We’re not there yet, but I believe that the research has come to a stage where it is now product-ready. We just have to implement better models using better datasets and create a good UX.


Synthetic Media for Games and Influencers

Nivas: As gamers, we’d love for stories to become more immersive. That’s where games seem to be going. The Witcher 3 from developer CD Projekt Red is one of the best games of all time. When the game ends you feel as though you’ve aged an entire lifetime. More and more games will try to follow that pattern. Our technology can help these games be more immersive. One factor that can really advance the immersiveness of games is for dialogues to be dynamic, for them to evolve based on your playing style.

Let’s say you’re playing an RPG and you are purchasing some items from a shop, but you don’t have the funds to buy a certain item. Do you want to hear the same rejection message over and over again? How cool would it be if that message evolved with time or the shopkeeper got annoyed that you were wasting their time? We want to create experiences that mimic real life. While those experiences within games are very exciting, Replica’s interests exist outside of the gaming industry as well.

Celebrities and influencers can utilize our technology to reach out and build a deeper relationship with their fans individually. Connecting with fans on a personal level is not something they can do when they have a substantial fanbase. With our technology, they can scale their voice and essentially be in thousands of places at once. Things like that are exciting to us because we’re creating something that can have a huge positive effect.


The Dangers of Synthetic Media

“It’s scary when you think of all the dangerous implications of AI voice technology. It’s an obvious target for hackers, scammers and identity theft.” – Replica

Nivas: A lot of the technology is being open sourced around image synthesis, video synthesis, and of course, audio synthesis. I fear for a future where there is so much rampant misuse of synthetic media that you can no longer tell the difference between what’s real and what’s fake.

I believe that AI is an accelerant to bringing up the problems that exist in society. But again, that is no reason for us to halt the progress of civilization. While Replica happens to be a creative company, there are applications of AI that could save lives. People are right to be worried, but there are countermeasures we can put in place and are already being put in place.

I think the more important question is: how do we figure out a way to differentiate between real and fake? Once we do that, platforms like YouTube, Facebook, and Twitter can police their content, remove fake content, and only allow licensed content to be posted.


How does Replica protect people’s voices?

Nivas: Replica is building a marketplace for the world’s voices. Each individual has full control over their own Replica voice, and they can license their voices at scale using our technology. We take privacy and security very seriously. We are not taking the voices of celebrities and releasing them to the public. Users will only have access to generic voices which are open-sourced. They will also have the ability to create their own voices and collaborate with friends and people they invite.


How does Replica prevent people from uploading unlicensed voices?

Nivas: Firstly, never have a repeatable process. That’s security 101. If there’s a repeatable process, it’s easy to take advantage of. We have systems which randomize the onboarding process. To prevent people from using voices that aren’t their property, we have a set of tools that do basic tests on the voices which users upload. On another level, we have a watermarking ability which can be used to verify whether a voice was made on our platform.

Any new technology is going to have positive and negative effects. I’ve made the case for what Replica stands for: privacy and security. We want people to realize that this is true and to trust us with scaling their voices securely using Replica’s Voice Marketplace.

In the future, Replica plans to partner with platforms like YouTube, Spotify, and Amazon to make sure that these platforms have the ability to differentiate licensed content from illegal content.


When will Replica voices rival performances of real actors?

Avid gamers will know that the quality of the voice actors can make or break story-based video games. While synthetic audio can mimic the sound and intonation of someone’s voice, can it also match or even surpass the acting ability of human performers?

Nivas: In a way, we are already at a point where synthetic voices are indistinguishable from real voices and can rival human performances. If you were to train a model on hours and hours of one person’s voice, I’m 100% sure a lot of people would find it hard to tell the difference. Being able to mimic someone’s performance, their emotion control, their intonation, their delivery and intensity of speech, a lot of these factors we can already model.


Does this mean the end of voice actors?

Nivas: Replica exists to empower creative people. Voice actors are incredibly creative people, and we help them increase their bandwidth, rather than putting them out of business. We value everyone’s voice as being part of their identity and part of their IP. We’re building a marketplace of voices where people can license their likeness and their Replica voices to creative studios around the world.


With Replica Studios, voice actors will have full control of where and how their voices are used and have the ability to set their own prices.

According to Nivas, the reactions from many voice actors have been very positive. A lot of the actors in the industry do a variety of jobs from video games to commercials, documentaries, and everything in between. Within the gaming industry especially, voice actors often have to create overly raspy tones to match different characters.

Nivas: Instead of taking jobs away from actors, we’re trying to make it even easier for them to make money and reach studios all over the world. Many voice actors said they’d love this product because it would mean they wouldn’t have to physically strain their voices as much. Sometimes they even lose their voices and lose the ability to work for a period of time. This technology could drastically alleviate that.


What is the future of Replica and synthetic media?

Nivas: Replica’s goal is always going to be to empower creative people. At Replica we’re building a marketplace for the world’s voices, which we’ll be doing in stages. We’re looking to speak to artists, celebrities and voice actors who want to be on our platform. It’s a novel approach to making money that we haven’t seen before. We’re already working with artists, celebrities, and well-known companies in the gaming and animation industries. It’s exciting and encouraging to have more people come in day by day. People are curious and it’s very rewarding.

In terms of synthetic media in general, AI is an enabling tool for creative people. I’m really excited not only about voice synthesis, but also about the development of GAN-based neural networks to synthesize images. What does that mean for the next Photoshop for images?

Perhaps children will go to school and in art class they’ll be playing with a neural network. That would be amazing. I’m excited about the potential for these technologies to empower creativity, to empower education. Creativity is quite evenly distributed around the world, but opportunity is not. AI is one of those things that levels the playing field in terms of opportunity.


Replica Studios has already launched and there is currently a long waitlist to gain access to the product. The studio is available for free. However, professional functions and additional licensed voices will be available for an extra fee. If you want to access Replica Studios and try the product out yourself, it’s best to sign up early, as waiting times range from a few weeks to over a month.

Want to learn more about Replica? Check out the Replica Studios blog.

The Author
Limarc Ambalina

Limarc writes content for Lionbridge’s website as part of the marketing team. Born and raised in Canada, Limarc’s love of Japanese pop culture brought him to Japan in 2016 and living in Japan has been his dream come true. Apart from Lionbridge content, you can catch Limarc online writing about anime, video games, and other nerd culture.

