Spectrally-Guided Diffusion Noise Schedules

2026-03-19 · Computer Vision and Pattern Recognition · Machine Learning
AI summary

The authors study how noise levels are set in diffusion models, which are tools for creating images and videos. Usually, these noise settings are handcrafted and need tweaking for different image resolutions. The authors propose a new method that uses an image's frequency content to automatically choose better noise settings for each image. Their method removes redundant denoising steps and improves the quality of generated images, especially when few sampling steps are used. This makes the generation process more efficient and effective.

denoising diffusion models, noise schedule, image generation, spectral properties, pixel diffusion, inference, noise levels, sampling, generative quality
Authors
Carlos Esteves, Ameesh Makadia
Abstract
Denoising diffusion models are widely used for high-quality image and video generation. Their performance depends on noise schedules, which define the distribution of noise levels applied during training and the sequence of noise levels traversed during sampling. Noise schedules are typically handcrafted and require manual tuning across different resolutions. In this work, we propose a principled way to design per-instance noise schedules for pixel diffusion, based on the image's spectral properties. By deriving theoretical bounds on the efficacy of minimum and maximum noise levels, we design "tight" noise schedules that eliminate redundant steps. During inference, we propose to conditionally sample such noise schedules. Experiments show that our noise schedules improve generative quality of single-stage pixel diffusion models, particularly in the low-step regime.
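The abstract's core idea, per-image minimum and maximum noise levels derived from the image's spectrum, can be illustrated with a speculative sketch. This is not the authors' method: the bound formulas, the function name `tight_noise_schedule`, and the log-spaced interpolation are all illustrative assumptions standing in for the paper's derived bounds.

```python
import numpy as np

def tight_noise_schedule(image, num_steps=20):
    """Illustrative per-image noise schedule from spectral statistics.

    Assumption (not the paper's derivation): sigma_max is tied to the
    image's strongest spectral mode, so the noisiest step fully masks
    global structure; sigma_min is tied to a mid-level spectral
    magnitude, below which further denoising steps would be redundant.
    """
    # Normalized magnitude spectrum of the zero-mean image.
    spectrum = np.abs(np.fft.fft2(image - image.mean())) / np.sqrt(image.size)
    mags = np.sort(spectrum.ravel())
    sigma_max = mags[-1]                       # masks the strongest mode
    sigma_min = max(mags[len(mags) // 2], 1e-4)  # floor at a small constant
    # Log-spaced "tight" schedule from sigma_max down to sigma_min.
    return np.exp(np.linspace(np.log(sigma_max), np.log(sigma_min), num_steps))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
sigmas = tight_noise_schedule(img)
```

Under this sketch, a smoother image (energy concentrated at low frequencies) yields a lower sigma_min, while a high-contrast image raises sigma_max, so the schedule adapts per instance as the abstract describes.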