SUR (Semantic Understanding & Reasoning) Adapter for Text-to-Image Diffusion Models
CS 726: Advanced Machine Learning, Prof. Sunita Sarawagi
Semantic Enhancement of Text-to-Image Diffusion Models
In this project, we studied and experimented with recent research on improving the semantic understanding of text-to-image generators. Specifically, we worked with:

- SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models (LLMs)
- ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

We designed architectural modifications to these implementations with the aim of improving the semantic understanding of the Stable Diffusion pipeline. We then compared our modified models against the vanilla implementations to evaluate whether performance improved.