Audio-Conditioned Lip Sync with Latent Diffusion Models
3D/4D Scenes from a Single Image with Controllable Video Diffusion