Latent diffusion model

Latent Diffusion Model
Original author(s): CompVis
Initial release: December 20, 2021
Repository: https://github.com/CompVis/latent-diffusion
Written in: Python
Type:
License: MIT

The Latent Diffusion Model (LDM)[1] is a diffusion model architecture developed by the CompVis (Computer Vision & Learning)[2] group at LMU Munich.[3]

Introduced in 2015, diffusion models (DMs) are trained to reverse successive applications of noise (commonly Gaussian) to training images. The LDM improves on the standard DM by performing the diffusion process in a lower-dimensional latent space rather than in pixel space, and by allowing conditioning through self-attention and cross-attention.
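The forward (noising) half of this training setup can be sketched in a few lines. The snippet below is a minimal illustration, not the CompVis implementation: it assumes a standard linear beta schedule and the closed-form DDPM noising equation, and uses a random array as a stand-in for a latent that would normally come from a VAE encoder.

```python
import numpy as np

def forward_diffuse(z0, t, alpha_bar, rng=None):
    """Noise a clean latent z0 to timestep t using the closed-form
    forward process: z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * eps.
    Returns the noised latent and the Gaussian noise eps that a
    denoising network would be trained to predict."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

# Linear noise schedule (values here are illustrative, not LDM's exact config).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

# Stand-in latent: in an LDM this would be the VAE encoding of an image,
# far smaller than the image itself (e.g. 4x8x8 instead of 3x64x64 pixels).
z0 = np.random.default_rng(0).standard_normal((4, 8, 8))
zt, eps = forward_diffuse(z0, t=500, alpha_bar=alpha_bar)
```

Because the diffusion runs on the compact latent `z0` rather than full-resolution pixels, each training and sampling step is substantially cheaper than in a pixel-space DM.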

LDMs are widely used in practical diffusion models. For instance, Stable Diffusion versions 1.1 to 2.1 were based on the LDM architecture.[4]

  1. ^ Rombach, Robin; Blattmann, Andreas; Lorenz, Dominik; Esser, Patrick; Ommer, Björn (2022). High-Resolution Image Synthesis With Latent Diffusion Models. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022. pp. 10684–10695.
  2. ^ "Home". Computer Vision & Learning Group. Retrieved 2024-09-05.
  3. ^ "Stable Diffusion Repository on GitHub". CompVis - Machine Vision and Learning Research Group, LMU Munich. 17 September 2022. Archived from the original on January 18, 2023. Retrieved 17 September 2022.
  4. ^ Alammar, Jay. "The Illustrated Stable Diffusion". jalammar.github.io. Archived from the original on November 1, 2022. Retrieved 2022-10-31.