Stable Diffusion from scratch

An implementation of generative models based on the research paper.

This project implements text-to-image and image-to-image generative models built from three components: a U-Net neural network, a CLIP encoder, and a Variational Autoencoder (VAE).
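As a rough illustration of how these three components fit together for text-to-image generation, here is a minimal data-flow sketch in plain Python. It is not the code from this project: every function below is a hypothetical stand-in that only produces data of a plausible shape. The shape constants (a 77 x 768 text embedding, a 4-channel latent at 1/8 of the image resolution) are assumptions based on Stable Diffusion v1 conventions, and the update rule in the loop is a placeholder, not a real sampler.

```python
import random

# Assumptions based on Stable Diffusion v1 conventions, not on this project.
LATENT_CHANNELS = 4          # depth of the latent the U-Net works on
VAE_SCALE = 8                # the VAE shrinks each spatial side by 8x
CONTEXT_SHAPE = (77, 768)    # tokens x embedding width of the text encoder


def clip_encode(prompt):
    """Hypothetical stand-in for the CLIP text encoder: prompt -> 77 x 768."""
    rng = random.Random(prompt)  # deterministic per prompt
    tokens, dim = CONTEXT_SHAPE
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(tokens)]


def unet_predict_noise(latent, context, timestep):
    """Hypothetical stand-in for the U-Net noise predictor.

    A real U-Net attends to `context` and is conditioned on `timestep`.
    This placeholder ignores both and returns a copy of the latent.
    """
    return [[row[:] for row in channel] for channel in latent]


def vae_decode(latent):
    """Hypothetical stand-in for the VAE decoder: 4 x h x w -> 3 x 8h x 8w."""
    h, w = len(latent[0]), len(latent[0][0])
    # Average the latent channels, then upsample by nearest neighbour.
    mean = [[sum(ch[y][x] for ch in latent) / len(latent) for x in range(w)]
            for y in range(h)]
    up = [[mean[y // VAE_SCALE][x // VAE_SCALE] for x in range(w * VAE_SCALE)]
          for y in range(h * VAE_SCALE)]
    return [[row[:] for row in up] for _ in range(3)]  # 3 colour channels


def text_to_image(prompt, height=512, width=512, steps=10, seed=0):
    """Data flow only: encode text, denoise a latent, decode to pixels."""
    rng = random.Random(seed)
    context = clip_encode(prompt)
    h, w = height // VAE_SCALE, width // VAE_SCALE
    # Text-to-image starts from pure Gaussian noise in latent space.
    latent = [[[rng.gauss(0.0, 1.0) for _ in range(w)] for _ in range(h)]
              for _ in range(LATENT_CHANNELS)]
    for timestep in reversed(range(steps)):
        noise = unet_predict_noise(latent, context, timestep)
        # Placeholder update; a real scheduler (e.g. DDPM) goes here.
        latent = [[[v - 0.1 * n for v, n in zip(lat_row, noise_row)]
                   for lat_row, noise_row in zip(lat_ch, noise_ch)]
                  for lat_ch, noise_ch in zip(latent, noise)]
    return vae_decode(latent)


image = text_to_image("a cat on a sofa", height=64, width=64, steps=3)
print(len(image), len(image[0]), len(image[0][0]))
```

In an image-to-image setup the overall flow is the same, except that the starting latent would come from encoding an input image with the VAE and adding noise to it, rather than from pure noise.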

The architecture follows the research paper included in the repo linked below.

I plan to add the notes I took while building and learning through this project, and I hope to publish a blog post on the topic soon.

Code is available at the repo.