Spectrally-Guided Diffusion Noise Schedules

2026-03-19 · Computer Vision and Pattern Recognition · Machine Learning
AI summary

The authors study how noise levels are set in diffusion models, which are tools for creating images and videos. Usually, these noise settings are handcrafted and need tweaking for different image resolutions. The authors propose a new method that uses an image's frequency content to automatically choose better noise settings for each image. Their method removes redundant denoising steps and improves the quality of generated images, especially when few sampling steps are used. This makes the generation process more efficient and effective.

denoising diffusion models, noise schedule, image generation, spectral properties, pixel diffusion, inference, noise levels, sampling, generative quality
Authors
Carlos Esteves, Ameesh Makadia
Abstract
Denoising diffusion models are widely used for high-quality image and video generation. Their performance depends on noise schedules, which define the distribution of noise levels applied during training and the sequence of noise levels traversed during sampling. Noise schedules are typically handcrafted and require manual tuning across different resolutions. In this work, we propose a principled way to design per-instance noise schedules for pixel diffusion, based on the image's spectral properties. By deriving theoretical bounds on the efficacy of minimum and maximum noise levels, we design "tight" noise schedules that eliminate redundant steps. During inference, we propose to conditionally sample such noise schedules. Experiments show that our noise schedules improve generative quality of single-stage pixel diffusion models, particularly in the low-step regime.
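The abstract's core idea, per-image minimum and maximum noise levels derived from the image's spectrum, can be illustrated with a speculative sketch. This is not the authors' method: the bound formulas, the function name `tight_noise_schedule`, and the log-spaced interpolation are all illustrative assumptions standing in for the paper's derived bounds.

```python
import numpy as np

def tight_noise_schedule(image, num_steps=20):
    """Illustrative per-image noise schedule from spectral statistics.

    Assumption (not the paper's derivation): sigma_max is tied to the
    image's strongest spectral mode, so the noisiest step fully masks
    global structure; sigma_min is tied to a mid-level spectral
    magnitude, below which further denoising steps would be redundant.
    """
    # Normalized magnitude spectrum of the zero-mean image.
    spectrum = np.abs(np.fft.fft2(image - image.mean())) / np.sqrt(image.size)
    mags = np.sort(spectrum.ravel())
    sigma_max = mags[-1]                       # masks the strongest mode
    sigma_min = max(mags[len(mags) // 2], 1e-4)  # floor at a small constant
    # Log-spaced "tight" schedule from sigma_max down to sigma_min.
    return np.exp(np.linspace(np.log(sigma_max), np.log(sigma_min), num_steps))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
sigmas = tight_noise_schedule(img)
```

Under this sketch, a smoother image (energy concentrated at low frequencies) yields a lower sigma_min, while a high-contrast image raises sigma_max, so the schedule adapts per instance as the abstract describes.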