[Proof-of-Concept] DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
TL;DR
Typical diffusion models generate images from input text. DreamPose, presented at ICCV 2023, extends this by generating a video from an input image of a human model and a pose sequence represented with DensePose.

Problem statement
Common diffusion models can generate images from a given text prompt. However, they cannot produce an animated sequence, nor can they be conditioned on an input pose sequence.

Method
Apply the following modifications to a diffusion model:...