Abstract

Diffusion models have made their mark in image synthesis by excelling in visual quality and flexibility. They use additional negative prompts with classifier-free guidance (CFG), which guides the model in generating images aligned with the user’s intent. However, CFG requires the model to run twice, making it difficult to interpret the impact of the negative prompt on the final image. My study proposes a method to generate a single merged prompt that produces quality on par with two-prompt, two-pass CFG. I prepared a prompt-to-image dataset, used per-image optimization to find the ground-truth single merged prompt for each image, and trained a neural network module to predict the embedding of that prompt. At inference time, my model generates a single prompt and runs a single diffusion pass, achieving up to 2x speedup and 20% memory reduction. My research contributes to developing more efficient diffusion models and to a deeper understanding of their characteristics.
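The two-pass cost the abstract refers to comes from the standard CFG combination rule, in which the noise predictions from a conditional (positive-prompt) pass and a negative-prompt pass are mixed with a guidance scale. A minimal sketch of that rule, using NumPy arrays as stand-ins for the model's noise predictions (all names and values here are illustrative, not from the thesis):

```python
import numpy as np

def cfg_combine(eps_pos: np.ndarray, eps_neg: np.ndarray, scale: float) -> np.ndarray:
    """Classifier-free guidance: combine two model passes.

    eps_pos -- noise prediction conditioned on the positive prompt
    eps_neg -- noise prediction conditioned on the negative prompt
    scale   -- guidance scale; scale=1.0 reduces to the positive prediction
    """
    return eps_neg + scale * (eps_pos - eps_neg)

# Toy stand-ins for the two noise predictions at one denoising step.
rng = np.random.default_rng(0)
eps_pos = rng.standard_normal(8)
eps_neg = rng.standard_normal(8)

guided = cfg_combine(eps_pos, eps_neg, scale=7.5)
```

A single merged prompt, as proposed here, would replace both `eps_pos` and `eps_neg` with one prediction, so `cfg_combine` and its second model pass are no longer needed per step.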
