Tool allows users to create multiple camera angles from a single camera
NVIDIA has added AI virtual cameras and speech recognition to its Holoscan For Media platform.
Holoscan For Media, launched last year, allows live media and video pipelines to run on the same infrastructure as AI, giving users access to AI applications alongside their production workloads.
The company has now added what it describes as AI reference applications for Holoscan, which can interface with uncompressed ST 2110 streams and add AI effects with minimal latency.
One of these is AI virtual cameras, built with PyTorch and the NVIDIA DeepStream SDK. It detects and tracks individuals in the stream, then creates multiple cropped virtual camera outputs focused on those individuals, meaning a user can generate several camera feeds from a single static camera.
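The cropping step behind such virtual cameras can be illustrated in a few lines. The sketch below is not NVIDIA's implementation; it simply shows how bounding boxes from an upstream person detector (e.g. a PyTorch model, assumed here) could be turned into fixed-aspect virtual camera crops of a frame. All function names and margin values are illustrative.

```python
import numpy as np

def virtual_camera_crops(frame, boxes, out_aspect=16 / 9, margin=0.2):
    """For each detected person box (x, y, w, h), return a crop of the
    frame centred on that person, expanded by a margin, forced to the
    output aspect ratio, and clamped to the frame bounds.

    Detection itself is assumed to happen upstream; the boxes passed
    in here are illustrative placeholders for detector output."""
    H, W = frame.shape[:2]
    crops = []
    for (x, y, w, h) in boxes:
        # Expand the box by the margin, then widen to the target aspect.
        cw = w * (1 + margin)
        ch = max(h * (1 + margin), cw / out_aspect)
        cw = ch * out_aspect
        # Centre the crop on the detection, clamped inside the frame.
        cx, cy = x + w / 2, y + h / 2
        x0 = int(max(0, min(cx - cw / 2, W - cw)))
        y0 = int(max(0, min(cy - ch / 2, H - ch)))
        x1, y1 = int(min(W, x0 + cw)), int(min(H, y0 + ch))
        crops.append(frame[y0:y1, x0:x1])
    return crops
```

Each returned crop can then be scaled to a standard output resolution and published as its own feed; a production pipeline would also smooth the crop position between frames to avoid jitter.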
In addition, NVIDIA has announced AI speech recognition: a web user interface that transcribes the incoming stream in real time, displaying live captions alongside a search field for finding words in the transcription.
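The searchable-transcript idea can be sketched simply. This is a toy in-memory model, not NVIDIA's service (whose API the article does not describe); the class and field names are illustrative. Timed caption segments arrive from a speech recogniser and are indexed so a user can look up every segment containing a word.

```python
from dataclasses import dataclass

@dataclass
class Caption:
    start_s: float  # timestamp of the caption within the stream
    text: str       # transcribed text for this segment

class TranscriptSearch:
    """Accumulates live captions and supports simple word search.
    A real deployment would stream segments from a speech-recognition
    backend and serve results to a web UI."""

    def __init__(self):
        self.captions = []

    def add(self, caption: Caption) -> None:
        # Called as each new caption segment arrives from the recogniser.
        self.captions.append(caption)

    def search(self, word: str) -> list:
        # Case-insensitive substring match over all stored segments.
        word = word.lower()
        return [c for c in self.captions if word in c.text.lower()]
```

Returning the timestamped segments (rather than bare text) is what lets the UI jump the operator to the point in the stream where a word was spoken.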
NVIDIA recommends the following setup for users of these tools:

- An AI workstation with an NVIDIA RTX Pro GPU and an NVIDIA ConnectX network interface card (with loopback cable or switch connectivity), or a certified multi-GPU system
- A functional NVIDIA Holoscan for Media environment, using either a local developer setup with Kubernetes or the platform reference deployment guide with a jump node
- Visual Studio Code or any other IDE for Linux platforms; the GNU Compiler Collection (GCC) can also be used