Video-to-Video

Video-to-video is an AI workflow that takes existing footage as input and produces a restyled version. In animation it is used to apply a hand-drawn or painterly look to live-action plates, to convert 3D renders into a different aesthetic, or to refine the output of earlier AI passes. It is a close relative of rotoscoping, with the frame-by-frame rendering work done by a model rather than by hand.
Inside a hybrid pipeline, video-to-video sits late, after a controlled plate exists. A 3D animator blocks the action with clean, controllable cameras, the renders go through a video-to-video pass that carries the geometry-true frames into a stylised world, and a compositor lands the result. The same path works for live action: a reference plate carries the performance, the model carries the look. This split keeps the performance under the discipline of acting for animation and moves the styling to a faster pipeline.
On hybrid AI projects like LEGS, video-to-video is used to bridge between clean 3D output and a more painterly look that would be slow to build by hand. The structure of the shot is preserved from the original render, while the texture, brushwork, and lighting respond to a style reference. The style transfer approach is a close cousin, focused on still frames; video-to-video adds the temporal dimension and a much higher cost of getting it wrong, because flicker is brutal at twenty-four frames a second.
The honest limits are flicker and drift. Frame-by-frame restyling can produce small inconsistencies that read as flicker, especially in fast-moving shots. Production pipelines pair video-to-video with temporal consistency tooling and a clean-up pass to settle the result. The other limit is faithful brand colour: subtle palette work can shift across the pass, so we lock a reference and check colour at compositing. For hero shots, we still favour hand-keyed work through our hybrid AI animation service.
A planning note worth naming: the input plate determines what video-to-video can do. A clean, well-lit, controllable 3D pass produces a clean, controllable restyle. A messy plate produces a messy restyle. We treat the plate as the contract, sign it off before the AI pass runs, and lock the cameras so a re-pass against an updated style reference does not destabilise the comp. The same discipline used to lock a storyboard before animation begins applies here: the cheapest moment to change your mind is before the slow stage runs, not during it.
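Flicker, the first limit named above, can be given a rough number before a shot reaches compositing. The sketch below is an illustration only, not the tooling used on any particular production. It assumes each frame has already been flattened to a list of grayscale values between 0 and 1, and it compares how much the restyle changes between consecutive frames with how much the plate changes over the same frames. The function names and the 0.05 threshold are hypothetical.

```python
from typing import List, Sequence

# Assumption: one frame is a flat sequence of grayscale values in [0, 1].
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def excess_change(plate: List[Frame], restyled: List[Frame]) -> List[float]:
    """For each consecutive frame pair, how much more the restyle changes
    than the plate does.

    Values near zero mean the restyle moves about as much as the plate;
    larger values are candidate flicker.
    """
    if len(plate) != len(restyled):
        raise ValueError("plate and restyle must have the same frame count")
    scores = []
    for i in range(1, len(plate)):
        plate_change = mean_abs_diff(plate[i - 1], plate[i])
        restyle_change = mean_abs_diff(restyled[i - 1], restyled[i])
        scores.append(max(0.0, restyle_change - plate_change))
    return scores


def flag_flicker(scores: List[float], threshold: float = 0.05) -> List[int]:
    """Indices of frames whose excess change is above the (hypothetical) threshold."""
    return [i + 1 for i, score in enumerate(scores) if score > threshold]


if __name__ == "__main__":
    plate = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]     # a static plate
    restyled = [[0.5, 0.5], [0.7, 0.7], [0.5, 0.5]]  # the restyle pulses
    print(flag_flicker(excess_change(plate, restyled)))
```

A per-pixel difference like this is crude: it ignores motion and perceptual weighting, so it is better suited to ranking shots for review than to passing or failing them.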
Myth Labs operates production video-to-video for brand and broadcast work where a controllable plate exists and a faster route to the agreed look is needed. For the wider context, see how artists are using AI without losing the craft.
Frequently asked questions
Is video-to-video the same as a filter or effect?
No. A filter applies a fixed rule to every frame. Video-to-video uses a generative model conditioned on a reference, which means the look can be much further from the source than a filter can take you. A LUT changes colour; video-to-video can take a 3D render to a watercolour world.
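To make the "fixed rule" side of that contrast concrete, here is a minimal sketch of a per-channel lookup table applied to every frame of a clip. The warm-grade values and the function names are made up for illustration. The point is that the same input pixel always maps to the same output pixel, whichever frame it sits in, which is why a LUT can shift colour but cannot invent brushwork or texture.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit RGB
Frame = List[Pixel]
Lut = List[List[int]]         # three 256-entry tables: R, G, B


def make_warm_lut() -> Lut:
    """Build a hypothetical warm grade: lift reds by 10%, pull blues down by 10%."""
    red = [min(255, round(v * 1.1)) for v in range(256)]
    green = list(range(256))
    blue = [round(v * 0.9) for v in range(256)]
    return [red, green, blue]


def apply_lut(frame: Frame, lut: Lut) -> Frame:
    """Map every pixel through the tables; no neighbouring pixels or frames are consulted."""
    return [(lut[0][r], lut[1][g], lut[2][b]) for r, g, b in frame]


def grade_clip(frames: List[Frame], lut: Lut) -> List[Frame]:
    """Apply the same fixed rule to every frame of the clip."""
    return [apply_lut(frame, lut) for frame in frames]


if __name__ == "__main__":
    clip = [[(100, 100, 100)], [(100, 100, 100)]]
    print(grade_clip(clip, make_warm_lut()))
```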
When does it earn its place over hand-painted animation?
When the action is performance-led and the look is style-led. A 3D pass nails the character animation and camera, a video-to-video pass nails the look. Hand-painting reaches further visually, but at a different time and cost.
How do you avoid the look flickering?
Three things: a locked style reference, temporal consistency techniques inside the model, and a compositing clean-up pass. For broadcast delivery, the clean-up is non-negotiable.
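As an illustration of what a clean-up pass can do at its simplest, the sketch below blends each frame with the smoothed result of the frame before it (an exponential moving average). This is an assumed, deliberately naive technique, not a description of the clean-up used in production: it has no motion compensation, so it ghosts on moving content, and real temporal consistency tooling is considerably more involved. Frames are again assumed to be flat lists of values between 0 and 1, and `alpha` is a hypothetical blend weight.

```python
from typing import List, Sequence


def temporal_smooth(frames: List[Sequence[float]], alpha: float = 0.6) -> List[List[float]]:
    """Exponential moving average across frames.

    alpha is the weight of the current frame; (1 - alpha) is the weight of
    the previous smoothed frame. alpha = 1.0 leaves the clip unchanged.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not frames:
        return []
    smoothed = [list(frames[0])]  # the first frame has nothing to blend with
    for frame in frames[1:]:
        previous = smoothed[-1]
        if len(frame) != len(previous):
            raise ValueError("frames must have the same number of pixels")
        smoothed.append(
            [alpha * cur + (1.0 - alpha) * old for cur, old in zip(frame, previous)]
        )
    return smoothed


if __name__ == "__main__":
    # A single pixel that pops bright for one frame is damped rather than removed.
    print(temporal_smooth([[0.0], [1.0], [0.0]], alpha=0.5))
```

Lower `alpha` values damp flicker harder but smear motion more, which is the trade-off that motion-aware tools exist to avoid.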
Sources (6)
Academic papers, recognised industry standards, and canonical industry texts that back up claims in this entry.
- Video Style Transfer: A Unified Algorithm for Global and Local Correspondence-Based Video Editing. Chen, He, Wang, et al., ACM Transactions on Graphics, 2008. Supports: video-to-video restyling workflow
- Real-Time User-Guided Image Colorization with Learned Deep Priors. Zhang, Isola, Efros, ACM Transactions on Graphics, 2017. Supports: preserves structure while changing style
- Neural Video Portraits. Kim, Zimmermann, Pumarola, et al., ACM Transactions on Graphics, 2018. Supports: portrait video relighting and reenactment
- Everybody Dance Now. Chan, Ginosar, Zhou, Efros, arXiv, 2019. Supports: source-video pose transfer pipeline
- First Order Motion Model for Image Animation. Siarohin, Lathuilière, Tulyakov, Ricci, Sebe, NeurIPS, 2019. Supports: motion transfer from driving video
- StyleBlit: Fast Projection onto Style Surfaces. Fišer, Hanika, Korman, Zwicker, ACM Transactions on Graphics, 2020. Supports: video and render stylization pipeline