The new model enables studios and production companies to transcribe content offline with near-cloud accuracy


Speech-to-text transcription specialist Speechmatics has launched an on-device speech model that ships inside Adobe Premiere.

It processes an hour of video in around 55 seconds, entirely offline, with nothing uploaded to the cloud.

Accuracy is within 5% of cloud-based transcription. 

Studios, agencies, and production companies handling content before it goes public can now work seamlessly from anywhere at full accuracy, with no dependency on a connection and no interruption to the work, says Speechmatics. 


The Speechmatics on-device model has been trained on millions of hours of speech to deliver high accuracy for accented speech, non-native speakers, and noisy environments like field reporting or film sets.

It runs on Windows and Mac, using the latest AI acceleration techniques to process efficiently across a range of hardware, including the latest Mac M5, NVIDIA RTX and AMD GPUs, as well as older hardware such as Intel Macs.

“Adobe’s global creator community speaks hundreds of languages and dialects. Since 2021, our partnership has focused on making sure speech technology works for everyone - whether you’re editing in Scottish English, Mexican Spanish, or Cantonese. Today, millions of users can benefit from accurate transcription that works anywhere – on-device for privacy, and in the cloud for scale – without compromising performance. As Adobe builds toward LLM-powered creative workflows, having a speech foundation that truly understands diverse voices becomes even more critical. We’re proud to be part of that future,” said Katy Wigdahl, CEO, Speechmatics.
