• Author(s): Chongjie Ye, Lingteng Qiu, Xiaodong Gu, Qi Zuo, Yushuang Wu, Zilong Dong, Liefeng Bo, Yuliang Xiu, Xiaoguang Han

The paper “StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal” introduces an approach to improving the stability and sharpness of surface normal estimates produced by diffusion models. Diffusion models are widely used for tasks such as image synthesis and denoising, but their inference process often suffers from high variance, which can lead to unstable and blurry results.

StableNormal addresses this issue by reducing the variance of the diffusion process, which yields more stable and sharper normal estimates. The core idea is to incorporate a stable target field that minimizes the variance of the training targets in the denoising score-matching objective. This matters most at intermediate noise-variance scales, where multiple modes in the data can affect the direction of the reverse paths. The method uses a reference batch to compute weighted conditional scores, which serve as more stable training targets. Doing so reduces the covariance of the training targets, trading a small amount of bias for lower variance. This bias diminishes as the reference batch size increases, so the model remains accurate while becoming more stable.
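The weighted conditional score described above can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes a Gaussian perturbation kernel p(x_t | x_0) = N(x_t; x_0, sigma^2 I) and self-normalized kernel weights over the reference batch, and the names `conditional_score` and `stable_target` are hypothetical, chosen only for this example.

```python
import numpy as np

def conditional_score(x_t, x0, sigma):
    """Score of the assumed Gaussian kernel N(x_t; x0, sigma^2 I) w.r.t. x_t."""
    return (x0 - x_t) / sigma**2

def stable_target(x_t, reference_batch, sigma):
    """Weighted average of conditional scores over a reference batch.

    Each reference point x_i is weighted by its (self-normalized) kernel
    density p(x_t | x_i). Assumption: Gaussian kernel, as stated above.
    """
    diffs = reference_batch - x_t                          # shape (n, d)
    log_w = -np.sum(diffs**2, axis=1) / (2.0 * sigma**2)   # log kernel density up to a constant
    log_w = log_w - log_w.max()                            # shift for numerical stability
    w = np.exp(log_w)
    w = w / w.sum()                                        # weights sum to 1
    return (w[:, None] * diffs).sum(axis=0) / sigma**2

# Two symmetric modes and a noisy point halfway between them.
reference = np.array([[-1.0, 0.0], [1.0, 0.0]])
x_t = np.zeros(2)

# Single-sample targets point in opposite directions, one per mode.
single_targets = [conditional_score(x_t, x0, sigma=1.0) for x0 in reference]

# The reference-batch target averages them according to the kernel weights.
batch_target = stable_target(x_t, reference, sigma=1.0)
```

In this toy setting the single-sample targets depend entirely on which mode generated the noisy point, whereas the reference-batch target does not. With a reference batch of size one, `stable_target` reduces to the ordinary conditional score, which matches the description of the bias shrinking as the batch grows rather than the target changing in kind.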

The paper reports extensive experiments to demonstrate the effectiveness of StableNormal. The authors evaluate their approach on several benchmark datasets and compare it with existing state-of-the-art diffusion models. The results show that StableNormal consistently improves image quality, stability, and training speed across datasets. Combined with an existing diffusion model such as EDM, StableNormal achieves a state-of-the-art Fréchet Inception Distance (FID) of 1.90 on unconditional CIFAR-10 generation with only 35 network evaluations. The paper also includes qualitative examples: the generated images show higher visual fidelity and sharper details than those produced by traditional diffusion models, and the improvement is most evident under challenging imaging conditions, where StableNormal maintains stability and sharpness.

In summary, “StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal” advances diffusion models by reducing the variance of the diffusion process. This enhances the stability and sharpness of normal estimates and leads to higher-quality, more reliable results. The research has implications for image synthesis, denoising, and other tasks that rely on accurate and stable normal estimates.