Access services experts talk to David Wood about the latest efforts to improve accuracy and reduce latency


One of the biggest areas of improvement in the past decade of television has been access services – better known as subtitling, audio description and signing.

In the UK, access services have been steadily improving in line with the requirements set down by regulator Ofcom. The aim is that all channels will have subtitles on 80% of output, audio description (additional commentary that describes the action taking place for the benefit of blind and partially sighted people) on 10% of output and signing on 5%.

The requirements for PSBs are even higher, with the BBC required to subtitle 100% of its output.

So far, most of the major UK channels have exceeded expectations, and the level of satisfaction with access services is high among the 1 million people who use them.

But, as Ofcom has pointed out in a series of recent reports on the quality of subtitles, latency (the delay between the audio and the appearance of corresponding subtitles) and poor accuracy have prompted complaints.

Ofcom reports that latency is one of the most frustrating aspects of live subtitling for audiences, often resulting in a disjointed experience.

But talk to the companies responsible for subtitling technology, and to those who use it, and a different picture emerges. They have taken note of Ofcom’s findings and are working with pressure groups to improve provision, but they argue that a reality check is required – particularly when it comes to live and fast-turnaround production.

“People have come to expect subtitling as a right, which is fine. But it can lead to the expectation that subtitles can be 100% accurate with zero latency – which is a bit unrealistic,” says Ericsson Broadcast and Media Services head of access services David Padmore.

Andrew Lambourne, business development director at Screen Systems, which makes subtitling tools including live subtitling system WinCAPS Q-Live, playout system Polistream and MediaMate for file-based subtitling workflows, says there needs to be greater appreciation of the difficulties of the job.

“To successfully subtitle live, you have to get into a Zen-like trance where your entire focus is listening and translating into clear speech whatever you hear. But you also have to have your wits about you, raising subtitles up if there is a caption, and adding colour to distinguish between speakers.”

To do a decent job, Lambourne recommends spending five times the running time of the show on subtitles, which might be difficult to achieve with live news and sport.

Latency is the biggest issue with live subtitling, because it has a significant impact on the viewers’ perception of quality. One solution used by broadcasters such as Belgian channel VRT is to introduce a short time-delay into broadcasts, so that Ericsson can provide synchronous subtitles.

But as Padmore points out, delaying broadcasts brings complications of its own. “Ofcom has talked to broadcasters in the UK about delaying live programming but there are significant technical, engineering and audience trust issues – particularly in live sport, where betting might be involved.”

BTI Studios’ marketing manager Corinna Hubbard, who is a former subtitler, says there has already been a lot of progress towards minimising latency through the use of re-speaking technology, such as Nuance’s Dragon NaturallySpeaking (DNS), a system that many access services providers, including BTI and Deluxe, now use.

Deluxe head of media access services Margaret Lazenby explains why: “Voice recognition technology such as DNS converts speech into text. Each subtitler builds a voice profile unique to their speech patterns, which is integrated with Screen’s Q-Live subtitling platform, which delivers the text to the subtitle inserter at transmission via IP.”

Padmore heads up Ericsson’s 500-strong team that supplies access services for major broadcasters in the UK, France, Germany, Spain, the Netherlands, Australia and the US. He says close collaboration between producer and subtitler is key.

“When you have live programmes with prepared elements – such as news packages – a lot of information is available in newsroom systems such as ENPS and iNews, which can be used to prepare subtitles in advance. Once you have the Autocue script, you have quite a lot of the output that you can deliver accurately in block subtitles, which are easier to read than scrolling subtitles, but harder to produce with minimum latency.”

Lazenby says that when Deluxe receives material close to TX, it is split between up to 10 subtitlers, who will each produce untimed subtitles for a small section of the programme. This will then be keyed out live in two-line blocks.
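As a rough illustration of the last step, the sketch below wraps a section of untimed subtitle text into two-line blocks ready to be cued out by hand. It is not Deluxe's tooling: the function name `two_line_blocks` and the 37-character line limit are assumptions made for the example, and a real system would also break lines at sensible phrase boundaries.

```python
import textwrap


def two_line_blocks(text, max_chars=37):
    """Wrap untimed text and group the wrapped lines into two-line blocks.

    max_chars is an assumed per-line limit; adjust it for the target platform.
    """
    lines = textwrap.wrap(text, width=max_chars)
    # Pair consecutive lines: [line1, line2], [line3, line4], ...
    return [lines[i:i + 2] for i in range(0, len(lines), 2)]


if __name__ == "__main__":
    section = (
        "The requirements for public service broadcasters are even higher, "
        "with the BBC required to subtitle all of its output."
    )
    for block in two_line_blocks(section):
        print("\n".join(block))
        print("---")  # the subtitler would cue the next block here
```

Each block holds at most two lines, so the person keying out the subtitles only has to decide when to release the next block, not how to lay it out.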

One genre where speaker-dependent voice recognition can fall down is sport. Padmore says this is a particularly tough genre, with its own syntax and grammar. Preparation – including researching and keying information such as players’ names into the speech recognition platform – is essential. Using re-speakers who are experts in particular sports is also a benefit, says Padmore, whose team subtitles output for Sky Sports and BT Sport.

The process could be speeded up further with the use of speaker-independent speech recognition, says Lambourne. Here, WAV files are transcribed automatically and an XML file is produced that can then be checked for errors by the subtitler.

Checking is important because automated speech recognition systems have been known to produce unintentionally hilarious transcriptions. “Independent speech recognition can be faster and more accurate, but it depends on having a clean audio feed rather than the whole soundtrack,” says Lambourne.

“Unfortunately, there seems to be an obsession with creating great audio, then obscuring it with inane music so you can’t hear what people are saying.” Producers could be incentivised by the possibility of later delivery deadlines if a programme is accompanied by clean audio.

Striving for better subtitling is worth the effort and cost, insists Lambourne. Estimates suggest that good subtitles can increase audiences by 10% – good value when you consider that they are relatively cheap to produce. “Subtitles can be used in banks, waiting rooms and pubs; they are useful to a much wider range of people than the deaf and hard of hearing.”

Future of subtitles

The European Broadcasting Union is leading the way with developments such as HbbTV, a standard for connected TVs and set-top boxes (STBs) that could give viewers the option of synchronising audio and subtitles within the STB.

The EBU is already well advanced with EBU-TT, a new subtitling standard in the form of a Timed Text XML format that is agnostic when it comes to the presentation layer.

Lack of standardisation is particularly common among the online players, which can cause problems for subtitle service providers. “There is a need for subtitles to be multiplatform,” says Ericsson’s Padmore, who oversees the supply of subtitles for the BBC’s iPlayer and Channel 4’s All 4.

Deluxe’s Lazenby adds: “New standards such as EBU-TT may help simplify delivery, with the possibility that one file could be created and used on multiple platforms.”

One big plus for EBU-TT is that it is a rich XML format, which means it can be used to add in more information, from scene changes to speaker ID, location and music, says Padmore.
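To give a flavour of what extra information in the subtitle file might look like, here is a minimal Python sketch that builds a Timed Text-style XML fragment carrying a speaker and a location alongside each subtitle. It is a simplified illustration, not a conformant EBU-TT document: the function `build_subtitle_doc` and the `urn:example:subtitle-meta` namespace are hypothetical, and the `ttm:agent` attribute is used loosely here, whereas Timed Text expects it to reference an agent declared in the document head.

```python
import xml.etree.ElementTree as ET

TT = "http://www.w3.org/ns/ttml"            # Timed Text (TTML) namespace
TTM = "http://www.w3.org/ns/ttml#metadata"  # TTML metadata namespace
EX = "urn:example:subtitle-meta"            # hypothetical, for illustration only


def build_subtitle_doc(cues):
    """Return an XML string with one <p> per cue, including extra metadata."""
    root = ET.Element(f"{{{TT}}}tt")
    body = ET.SubElement(root, f"{{{TT}}}body")
    div = ET.SubElement(body, f"{{{TT}}}div")
    for cue in cues:
        p = ET.SubElement(
            div, f"{{{TT}}}p", {"begin": cue["begin"], "end": cue["end"]}
        )
        # Speaker ID, simplified: a bare name rather than a declared agent.
        p.set(f"{{{TTM}}}agent", cue["speaker"])
        # Location is not a standard attribute; it lives in the example namespace.
        if cue.get("location"):
            p.set(f"{{{EX}}}location", cue["location"])
        p.text = cue["text"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_subtitle_doc([
        {"begin": "00:00:01.000", "end": "00:00:03.000",
         "speaker": "presenter", "location": "studio",
         "text": "Good evening."},
    ]))
```

Because the metadata travels with the timed text, a media owner could in principle search or index a programme by speaker or location without touching the presentation layer.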

“Future development is not just about subtitling providing access to the viewer, but also how to enhance the value of the subtitle file to the media owner.”