This is the official PyTorch implementation of Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation, published at ICML 2026.
Dahee Kwon · Haeun Lee · Jaesik Choi
Recent text-to-image models produce high-quality, prompt-aligned images, but samples generated from the same prompt often look overly similar. We investigate this homogeneity by analyzing intermediate Transformer features and find that their spatial average, or DC component, quickly becomes similar across different seeds early in generation. This early convergence locks generation trajectories into similar outcomes and reduces later variation. Motivated by this, we propose DC Attenuation for diVersity Enhancement (DAVE), a training-free representation-level intervention that attenuates the early DC component. DAVE improves prompt-consistent diversity with negligible overhead while maintaining image quality.
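The idea of attenuating the DC component can be illustrated with a small, framework-free sketch. This is not the DAVE implementation: the function names, the `early_steps` cutoff, the `scale` factor, and the choice to subtract a fraction of the per-channel token mean are all assumptions made for illustration. The actual layers, schedule, and strength are defined in the customized `diffusers_dave` files.

```python
from typing import List

# Hypothetical feature layout: one row per spatial token, one column per channel.
Features = List[List[float]]


def spatial_mean(features: Features) -> List[float]:
    """Per-channel average over all spatial tokens (the DC component)."""
    n_tokens = len(features)
    return [sum(channel) / n_tokens for channel in zip(*features)]


def attenuate_dc(
    features: Features,
    step: int,
    early_steps: int = 5,   # assumed cutoff; not taken from the paper
    scale: float = 0.5,     # assumed factor: fraction of the DC component kept
) -> Features:
    """Shrink the DC component to `scale` times its value during early steps.

    The token-wise deviations from the mean (the non-DC part) are unchanged.
    After `early_steps`, the features are returned as a plain copy.
    """
    if step >= early_steps:
        return [row[:] for row in features]
    dc = spatial_mean(features)
    return [
        [value - (1.0 - scale) * mean for value, mean in zip(row, dc)]
        for row in features
    ]
```

With `scale=1.0` the features are left unchanged, and with `scale=0.0` the spatial average is removed entirely, so the single factor controls how strongly the shared component is suppressed in this sketch.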
Follow the steps below to prepare the environment and install DAVE.
Create and activate the required Conda environment:
conda env create -f environment.yml
conda activate DAVE
To use our source code, clone the diffusers repository to your local workspace:
git clone https://github.com/huggingface/diffusers.git
Overwrite the standard diffusers source files with the customized DAVE files, and then install the package from source:
# Copy custom DAVE scripts into the diffusers directory
cp diffusers_dave/pipeline_stable_diffusion_3.py diffusers/src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py
cp diffusers_dave/transformer_sd3.py diffusers/src/diffusers/models/transformers/transformer_sd3.py
# Install from source
cd diffusers
pip install -e .
To apply this method to models other than Stable Diffusion 3, port the code sections enclosed by the markers below into your target model's codebase.
# ========================= DAVE CHANGE START =========================
...
# ========================== DAVE CHANGE END ==========================
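When porting to another model, it can help to list every marked region first. The helper below is a minimal sketch, not part of this repository: `find_marked_blocks` is a hypothetical name, and it only assumes that each region is delimited by lines containing the `DAVE CHANGE START` and `DAVE CHANGE END` text shown above.

```python
from typing import List, Tuple

START_TAG = "DAVE CHANGE START"
END_TAG = "DAVE CHANGE END"


def find_marked_blocks(source: str) -> List[Tuple[int, int, str]]:
    """Return (first_line, last_line, code) for each marked region.

    Line numbers are 1-indexed and refer to the code between the markers,
    excluding the marker lines themselves.
    """
    lines = source.splitlines()
    blocks = []
    start = None  # line number of the most recent START marker, if any
    for number, line in enumerate(lines, start=1):
        if START_TAG in line:
            start = number
        elif END_TAG in line and start is not None:
            # lines[start] is the first line after the START marker.
            body = "\n".join(lines[start:number - 1])
            blocks.append((start + 1, number - 1, body))
            start = None
    return blocks
```

For example, passing the text of `diffusers_dave/transformer_sd3.py` (read with `pathlib.Path(...).read_text()`) to this function would list the regions to carry over to the target model.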
💡 Note: We will be releasing custom-dave versions for other models shortly!
Now you are ready to play with DAVE!
You can interactively explore our method using the provided Jupyter notebook.
If you find this repo useful, please cite our paper:
@article{kwon2026breaking,
title={Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation},
author={Kwon, Dahee and Lee, Haeun and Choi, Jaesik},
journal={arXiv preprint arXiv:2606.06813},
year={2026}
}
For questions, research discussions, or collaboration opportunities, please feel free to contact Dahee Kwon.

