The companies have collaborated on Visual Reasoning technology that allows cameras to understand and interpret live scenes

At NAB, PTZOptics showcased its Visual Reasoning innovation, created in collaboration with Moondream, which understands live video feeds and can interpret and respond to what the camera sees in real time.
It provides live details such as a continually updated, detailed description of the scene; the number of people in the scene and what they look like; the number of ‘thumbs up’ gestures people have made in front of the screen; and so on.
Visual Reasoning has a variety of potential applications, from in-stadia experiences and fan engagement through to automated sports production.
For a football match, for example, Moondream’s vision language model can interpret the live video feed in real time to understand who has possession, when control of the ball changes, and when key moments such as shots or goals happen.
That reasoning layer can then drive intelligent camera movement and automated production actions. The result is a more context-aware approach to sports production, where cameras respond to what is actually happening in the scene.
“PTZOptics has built a strong foundation for intelligent camera workflows, and this collaboration shows how our lightweight visual AI can add a new layer of understanding to live production,” said Jay Allen, CEO, Moondream. “Together, we’re demonstrating a practical step beyond ball tracking, where video can be interpreted and acted on as the action unfolds. Just as importantly, this approach can extend far beyond sports, from tracking the person with the mic to following the person in the red shirt.”
“Visual Reasoning is about giving cameras the context they need to make smarter production decisions,” added Claudia Barbiero, director of global marketing, PTZOptics. “Working with Moondream helps us show what happens when cameras do more than follow the ball. They begin to understand the play and respond to it in real time. Instead of simply detecting motion, the system can interpret the scene more like a producer would, identifying who is carrying the ball and keeping the whole player in frame.”