[Summary] RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

TL;DR The process of video editing can be time-consuming and laborious. Many diffusion-based video models either fail to preserve temporal consistency or require significant resources. To address this, the “RAVE” method uses a clever trick: it tiles video frames into a single “grid image”. The grid image is then fed to a diffusion model (+ControlNet) to produce an edited version of it. Reconstructing the video from the edited grid image yields a temporally consistent edited video....
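A minimal NumPy sketch of the grid trick, assuming illustrative frame sizes and a 3×3 layout (not RAVE's actual settings). The diffusion editing pass is left as a hypothetical `edit_with_diffusion` call, and RAVE's repeated reshuffling across denoising steps is reduced to a single shuffle for brevity:

```python
import numpy as np

def frames_to_grid(frames, rows, cols):
    """Tile equally sized (H, W, C) frames into one grid image."""
    assert len(frames) == rows * cols
    h, w, c = frames[0].shape
    grid = np.zeros((rows * h, cols * w, c), dtype=frames[0].dtype)
    for i, frame in enumerate(frames):
        r, q = divmod(i, cols)
        grid[r * h:(r + 1) * h, q * w:(q + 1) * w] = frame
    return grid

def grid_to_frames(grid, rows, cols):
    """Split an (edited) grid image back into individual frames."""
    h, w = grid.shape[0] // rows, grid.shape[1] // cols
    return [grid[r * h:(r + 1) * h, q * w:(q + 1) * w]
            for r in range(rows) for q in range(cols)]

# The grid is edited as a single image, so the diffusion model (+ControlNet)
# applies one coherent edit across all tiled frames at once.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(9)]
order = rng.permutation(len(frames))            # randomized frame shuffling
grid = frames_to_grid([frames[i] for i in order], 3, 3)
# edited_grid = edit_with_diffusion(grid)       # hypothetical diffusion+ControlNet pass
edited = grid_to_frames(grid, 3, 3)             # un-grid the edited image
restored = [None] * len(frames)                 # then undo the shuffle
for slot, idx in enumerate(order):
    restored[idx] = edited[slot]
```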

January 6, 2024 · 2 min · 422 words

[Proof-of-Concept] DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion

TL;DR Typical diffusion models create images from input text. DreamPose, presented at ICCV 2023, extends this capability by generating a video from a single image of a human model and a pose sequence, represented by DensePose. Problem statements Common diffusion models are able to generate images from given text. However, they can neither produce animated sequences nor be conditioned on an input pose sequence. Method Apply the following modifications to a diffusion model:...
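Since the preview truncates the modification list, here is a minimal PyTorch sketch of one common way to add pose conditioning to a pretrained denoising UNet: concatenate the pose representation channel-wise with the noisy input and widen the first convolution with zero-initialized weights for the new channels. This is a generic illustration, not DreamPose's exact architecture; the channel counts and layer shapes are assumptions made up for the example:

```python
import torch
import torch.nn as nn

LATENT_C, POSE_C = 4, 2  # assumed sizes: latent channels + a 2-channel pose map

first_conv = nn.Conv2d(LATENT_C, 320, kernel_size=3, padding=1)  # "pretrained" layer
widened = nn.Conv2d(LATENT_C + POSE_C, 320, kernel_size=3, padding=1)
with torch.no_grad():
    widened.weight.zero_()                            # pose channels start at zero,
    widened.weight[:, :LATENT_C] = first_conv.weight  # so pretrained behavior is
    widened.bias.copy_(first_conv.bias)               # preserved at fine-tuning start

noisy_latent = torch.randn(1, LATENT_C, 64, 64)
pose_map = torch.randn(1, POSE_C, 64, 64)   # stand-in for a DensePose encoding
x = torch.cat([noisy_latent, pose_map], dim=1)
out = widened(x)   # equals first_conv(noisy_latent) at initialization
print(out.shape)   # torch.Size([1, 320, 64, 64])
```

Zero-initializing the new channels means the conditioned network initially ignores the pose input and only learns to use it gradually during fine-tuning, which avoids destroying the pretrained image prior.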

November 18, 2023 · 3 min · 437 words