Joe Lewis at The Voiceover Gallery says that when we embrace AI – not as a replacement, but as a tool – the future doesn’t feel quite as bleak


We all know AI is here to stay, but from our perspective the hype and hysteria have somewhat died down over the past few months.

Yes, there have been contractions in certain sectors – it is difficult to compete with AI voice platforms churning out low-budget or long-form dubbing and localisation projects. However, it is worth acknowledging that AI is a tool that can help companies become more productive and, in many cases, assist us in improving our deliverables while also being more cost-effective.

With that in mind, at The Voiceover Gallery, we would argue that a hybrid approach is both pragmatic and beneficial.

Three years ago, I attended an AI in media conference where I was confronted with both the doomsayer and the evangelist. The panellists had created fully AI-generated projects, declaring that the use of human voices would soon be a thing of the past.

The evangelists were embracing this as it would enable them to produce professional-quality content without the need for large teams; the doomsayers were concerned about the death of the industry as we know it. The reality is somewhere in between.

There was also a common theme at these conferences – the best use cases for AI in production are for dialogue or voiceover. In many ways, this is true. It feels real, it can be quick, and if we consider one sector of our business at The Voiceover Gallery, localisation, it could be a real game-changer in terms of productivity.

Nonetheless, there was a pragmatism that was being ignored. There is now a certain consensus that humans have experiences and nuances that cannot be generated.

AI cannot recreate the lived experience of an actor with years of training, nor can it take direction or experiment.

In my opinion, there are no LLMs currently available that can get close. I also suspect that, regardless of what we hear, there are no computers with enough processing power to really capture the emotional range of a human.

This raises the question – how useful is AI in dialogue work, particularly in the context of our high-end advertising and animation origination projects?

To be frank, we don’t use it that often. We are constantly testing platforms and often they don’t deliver, but we do see some excellent use cases.

For example, we sometimes work with children, and in those cases even simple pickups or amendments aren’t always straightforward.

Getting licences to re-record with children isn’t always a quick process, and, as many who work in production can attest, everything needs to be quick.

Cloning a voice (which at The Voiceover Gallery is always done with permission) can sometimes save time – if it’s just for a couple of words or lines. It can work, and it can be explored.

Most importantly, the general faff involved in endlessly regenerating a voice on an AI platform is less intimidating if it is a line or two as opposed to a 200-300-word script.

Perhaps the most exciting area for us is the use of AI technology in voice-morphing, as we can now use AI in scenarios where the same actor voices multiple parts in an animation.

This isn’t an unusual practice within the world of voiceover, but this, in my opinion, is where AI genuinely works as a tool for improvement.

When we are directing a voice and trying to dial in the timbre, delivery, and accent of a new character, there are instances when we love what an actor has done, but they are a touch too close to another character voiced by the same actor.

There are platforms that allow engineers and directors to subtly shift a voice’s character far enough away from the more general, unchangeable characteristics of an actor’s voice to help it sound more original.

There are also some great examples of using voice cloning in the workflow that, though controversial, are incredible.

The voice cloning of Adrien Brody so that he could deliver a clear Hungarian accent is an excellent example. Over the years, I’ve watched plenty of foreign content with native speakers, who can often pinpoint when an English-speaking actor has mastered the language but not the accent.

If we couple this with the idea of reducing the months of language learning, as well as the number of takes required to even get the dialogue to an acceptable place, the result is improved efficiency in production and reduced expense.

Finally, if we continue to look at AI solutions in our productions and workflows, it is also worth pointing out the more mundane application of dialogue clean-up. There may be a reduced argument for ADR on certain projects, as AI audio clean-up plugins are improving rapidly.

If we embrace AI – not as a replacement, but as a tool – the future doesn’t feel as bleak. Nonetheless, I would advise that anyone incorporating AI into their workflow proceed with caution.

Outside of the threat of retrospective legislation targeting the training models being used, clients are also becoming wary of using any AI-generated content.

It is fair to say that some clients are expressly requesting that it is not used without permission, if at all. With that in mind, treat it like that dubious take – if in doubt, leave it out. 


Joe Lewis is head of audio at The Voiceover Gallery
