Graham McGuinness at Jigsaw24 Media explains how AI is already helping post-production, from creating technical grades to post-producing scripted content

The rate of change within AI is simply staggering. Progress is now measured in hours or days, not months or years, and the effect it will have on the technology landscape in every sector is hard to overstate.

AI-trained capabilities such as speech-to-text, automated tagging and facial recognition are already being leveraged to great effect in production and post-production.
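To make the speech-to-text piece concrete, here is a minimal sketch using OpenAI's open-source Whisper model. The media file name is a placeholder, and a real logging workflow would feed the time-coded segments into an asset management system rather than printing them:

```python
# A minimal speech-to-text sketch using OpenAI's open-source Whisper model
# (`pip install openai-whisper`, with ffmpeg on the PATH). The media file
# name is a placeholder.
import whisper

model = whisper.load_model("base")          # small, CPU-friendly checkpoint
result = model.transcribe("interview.mp4")  # Whisper pulls audio via ffmpeg

# Each segment carries timecodes, which is what makes this useful for
# logging, subtitling and automated tagging workflows.
for seg in result["segments"]:
    print(f"{seg['start']:7.2f} -> {seg['end']:7.2f}  {seg['text'].strip()}")
```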

Object recognition, or computer vision more broadly, is also developing at a rapid rate and becoming phenomenally powerful thanks to ever-growing training datasets and improved object detection algorithms.
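Off-the-shelf pretrained detectors already get remarkably far. As an illustration, a sketch with torchvision, where the model choice and file name are assumptions:

```python
# Object detection sketch with a pretrained torchvision model
# (`pip install torch torchvision`). "frame.jpg" is a placeholder.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    FasterRCNN_ResNet50_FPN_Weights,
    fasterrcnn_resnet50_fpn,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

img = read_image("frame.jpg")
with torch.no_grad():
    pred = model([preprocess(img)])[0]

categories = weights.meta["categories"]
for label, score in zip(pred["labels"], pred["scores"]):
    if float(score) > 0.8:  # keep only confident detections
        print(categories[int(label)], f"{float(score):.2f}")
```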

When it comes to generative AI, synthesised speech has made massive strides recently, and text-to-image systems such as Midjourney, Stable Diffusion, Craiyon and DALL·E are evolving rapidly with each new release.
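Stable Diffusion in particular is openly available, so generating an image is now a few lines of code. A minimal sketch via Hugging Face diffusers, assuming a CUDA GPU and an invented prompt:

```python
# Text-to-image sketch via Hugging Face diffusers
# (`pip install diffusers transformers accelerate torch`, CUDA GPU assumed).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("matte painting of a rain-soaked city street at dusk").images[0]
image.save("concept_frame.png")
```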

We are now also seeing the solid beginnings of generative video using machine learning emerging from a number of research teams. Runway ML is one of the companies at the forefront of this field, offering useful tools such as colourise, object removal, text-to-colour grade, depth extraction and many more.
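Runway's tools are products in their own right, but the underlying techniques are increasingly accessible. As a hedged sketch of object removal via inpainting, using the open Stable Diffusion inpainting weights rather than Runway's own tooling (file names and the mask are placeholders):

```python
# Object removal via inpainting with the open Stable Diffusion inpainting
# weights (not Runway's own tools). Mask and file names are placeholders;
# white pixels in the mask mark the region to remove and resynthesise.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

frame = Image.open("frame.png").convert("RGB")
mask = Image.open("boom_mic_mask.png").convert("RGB")

clean = pipe(prompt="clear sky", image=frame, mask_image=mask).images[0]
clean.save("frame_clean.png")
```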

But what does this mean for the creative space? While it provides a rich canvas for experimentation, we’re still a little way away from tools like these having a real impact on most media and entertainment workflows.

This is partly because the toolsets still feel constrained from a creative user's perspective, the process can be slow, and the results are not always consistent. At this point, it takes a mixture of prompting and programming skills, plus image-processing experience, for advanced users to produce this content. There are also significant concerns around societal harm from the production of fake content, and systems like Google Brain's Imagen are available only to small teams of researchers.

That being said, the launch of Adobe Firefly as a closed beta has shown some compelling progress in bringing more of a design-oriented focus and control to the creative palette, and some of the image correction capabilities of products like fylm.ai and Colourlab are nothing less than amazing.

Using AI to set a technical grade would be a compelling, time-saving starting point for most creative colourists, and these tools could improve the quality of quick-turnaround programmes without increasing post-production costs. With the ability to apply LUTs already available in some generative AI software, this is almost certainly an area that will see significant developments over the short term.
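The mechanics of the LUT side are already straightforward to script. A minimal sketch, assuming the colour-science Python package, placeholder file names and a float RGB frame in the LUT's expected domain:

```python
# Applying a technical-grade LUT to a frame with the colour-science package
# (`pip install colour-science`). File names are placeholders, and the
# frame is assumed to be float RGB in the LUT's expected domain.
import colour

lut = colour.read_LUT("rec709_technical_grade.cube")  # 3D .cube LUT
frame = colour.read_image("raw_frame.exr")

graded = lut.apply(frame)  # per-pixel 3D lookup with interpolation
colour.write_image(graded, "graded_frame.exr")
```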

Generative AI could also drive the next step-change in the NLE space. While creative edits are likely to remain in the hands of human editors for the near future, it will be interesting to see how the technology can be used to post-produce scripted content. With tools like ScriptSync already available (and text-based editing from Adobe in beta), it's easy to imagine how a script, editors' notes and storyboards could be fed into a generative AI application along with the rushes to produce a first-pass edit, guided by a curated structure.
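No such application exists yet, but the core idea can be sketched: align script lines to time-coded transcripts of the rushes and propose a rough cut list. Everything below is hypothetical and deliberately naive; a real system would also weigh performance, coverage and continuity:

```python
# Hypothetical first-pass assembly: match each script line to the
# best-matching time-coded transcript segment from the rushes and print
# a rough cut list. All names and data here are invented.
from difflib import SequenceMatcher

script = [
    "We need to leave tonight.",
    "Not before we find the ledger.",
]

# (clip, start, end, transcript), e.g. from a speech-to-text pass.
rushes = [
    ("A001_T1", 12.0, 14.5, "we need to leave tonight"),
    ("A001_T2", 30.2, 33.0, "we should probably leave tonight"),
    ("A002_T1", 5.1, 8.4, "not before we find the ledger"),
]

def best_take(line):
    similarity = lambda seg: SequenceMatcher(None, line.lower(), seg[3]).ratio()
    return max(rushes, key=similarity)

for line in script:
    clip, start, end, _ = best_take(line)
    print(f"{clip}  {start:6.1f}-{end:6.1f}  <- {line}")
```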

However, any AI-driven process still needs to be assessed by humans to find anomalies or errors, and there are open questions and challenges around content security and copyright protection. The net result is that tasks deploying these tools will initially either be non-critical, or correctable ahead of wider consumption.

We also shouldn't underestimate the impact that generative AI can have on making the broader industry more efficient, beyond direct production or post-production workflows. For example, many teams now use ChatGPT (and now GPT-4) as an intelligent assistant: to write code, to debug configuration setups, and to draft comprehensive knowledgebase answers to technical support questions that knowledge workers can then rewrite or repurpose to their needs.
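As a small example of that assistant pattern, drafting a knowledgebase answer via the OpenAI API might look like the following; the ticket text and prompt are invented, and the output is a draft for a human to review:

```python
# Drafting a knowledgebase answer with the OpenAI API (`pip install openai`,
# OPENAI_API_KEY set in the environment). The ticket text is invented and
# the output is a draft for a human to review, not a final answer.
from openai import OpenAI

client = OpenAI()
ticket = "Premiere Pro can't see media on our NAS after the latest update."

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Draft a concise knowledgebase answer for a "
                    "post-production support team."},
        {"role": "user", "content": ticket},
    ],
)
print(response.choices[0].message.content)
```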

A distinct trend over the last six months is that many of these services are becoming subscription-based or charged per unit of processing, an indication that we're moving beyond the research and testing phase and potentially into commercial models. What is clear is that this is just the beginning of a fundamental structural shift in how technology works and how we as humans use it.

Graham McGuinness is head of technical architecture at Jigsaw24 Media