Download PDFOpen PDF in browserHiPA: Enabling One-Step Text-to-Image Diffusion via High-Frequency Promotion10 pages•Published: April 19, 2026AbstractDiffusion models have revolutionized text-to-image generation, but their real-world applications are hampered by the extensive inference time needed for hundreds of diffusion steps. Although progressive distillation and consistency distillation have been proposed to speed up diffusion sampling to 2-8 steps, they still fall short in one-step generation due to poor abilities to generate high-frequency content. To overcome this issue, we introduce High-frequency-Promoting Adaptation (HiPA), a parameter-efficient approach to enable one-step text-to-image diffusion. Grounded in the insight that high-frequency information is essential but highly lacking in one-step diffusion, HiPA focuses on training one-step, low-rank adaptors to specifically enhance the under-represented high-frequency abilities of advanced diffusion models. The learned adaptors empower these diffusion models to generate high-quality images in just a single step. Compared with progressive distillation, HiPA achieves much better performance in one-step text-to-image generation (37.3 -> 23.8 in FID-5k on MS-COCO 2017) and 28.6x training speed-up (108.8 -> 3.8 A100 GPU days), requiring only 0.04% training parameters (7,740 million -> 3.3 million). We also demonstrate HiPA's effectiveness in text-guided image editing, inpainting and super-resolution tasks, where our adapted models consistently deliver high-quality outputs in just one diffusion step.Keyphrases: high frequency promotion, one step diffusion, text to image generation In: Jernej Masnec, Hamid Reza Karimian, Parisa Kordjamshidi and Yan Li (editors). Proceedings of AI for Accelerated Research Symposium, vol 3, pages 1-10.
|

