Internal noise in deep neural networks: interplay of depth, neuron number, and noise injection step

2026-04-09
Neural and Evolutionary Computing

AI summary

The authors studied how random noise injected inside deep neural networks affects their accuracy. They compared two injection points, before and after the activation function of each neuron, and found that noise before the activation function is less harmful, especially when it is additive. Noise after the activation function degrades performance more, and earlier hidden layers contribute most to the degradation because their noise accumulates as it passes through the later layers. They also showed that a technique called pooling reduces noise and improves performance regardless of where the noise is injected.

deep neural networks, Gaussian noise, activation function, additive noise, multiplicative noise, noise injection, analog neural networks, pooling, hidden layers, noise filtering
Authors
D. A. Maksimov, V. M. Moskvitin, N. Semenova
Abstract
This paper examines the influence of internal Gaussian noise on the performance of deep feedforward neural networks, focusing on the role of the noise injection stage relative to the activation function. Two scenarios are analyzed: noise introduced before and after the activation function, for both additive and multiplicative noise. Noise before the activation function is similar to perturbations in the input channel of a neuron, while noise introduced after the activation function is analogous to noise occurring either within the neuron itself or in its output channel. The types of noise and the method of their introduction were inspired by analog neural networks. The results show that the activation function acts as an effective nonlinear filter of noise. Networks with noise introduced before the activation function consistently achieve higher accuracy than those with noise applied after it, with additive noise being more effectively suppressed in this case. For noise introduced after the activation function, multiplicative noise is less detrimental than additive noise, and earlier hidden layers contribute more significantly to performance degradation due to cumulative noise amplification governed by the statistical properties of subsequent weight matrices. The study also demonstrates that pooling-based noise reduction is effective in both cases, whether noise is introduced before or after the activation function, and consistently improves network performance.
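The four scenarios described in the abstract (additive or multiplicative noise, injected before or after the activation function) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `add_noise`, `noisy_layer`, and `pooled_layer`, the use of `tanh` as the activation, the parametrization of the noise by a standard deviation `sigma`, and the reading of pooling as an average over independently noisy copies of the same layer output.

```python
import numpy as np


def add_noise(signal, kind, sigma, rng):
    """Perturb `signal` with zero-mean Gaussian noise of standard deviation `sigma`.

    Assumed noise models (illustrative, not the paper's exact parametrization):
      additive:        y = s + xi
      multiplicative:  y = s * (1 + xi)
    """
    xi = rng.normal(0.0, sigma, size=signal.shape)
    if kind == "additive":
        return signal + xi
    if kind == "multiplicative":
        return signal * (1.0 + xi)
    raise ValueError(f"unknown noise kind: {kind!r}")


def noisy_layer(x, weights, stage, kind, sigma, rng, activation=np.tanh):
    """Evaluate one dense layer with noise injected before or after the activation."""
    z = x @ weights  # clean pre-activation
    if stage == "before":
        # Noise enters through the input channel and is then passed
        # through the nonlinearity.
        return activation(add_noise(z, kind, sigma, rng))
    if stage == "after":
        # Noise is applied to the neuron output and reaches the next layer unfiltered.
        return add_noise(activation(z), kind, sigma, rng)
    raise ValueError(f"unknown injection stage: {stage!r}")


def pooled_layer(x, weights, stage, kind, sigma, rng, copies=8):
    """Average `copies` independently noisy evaluations of the same layer.

    This is one assumed reading of pooling-based noise reduction.
    """
    outputs = [
        noisy_layer(x, weights, stage, kind, sigma, rng) for _ in range(copies)
    ]
    return np.mean(outputs, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 10))
    w = rng.normal(size=(10, 20)) / np.sqrt(10)
    clean = np.tanh(x @ w)
    # Compare the mean squared deviation from the noise-free output.
    for stage in ("before", "after"):
        for kind in ("additive", "multiplicative"):
            noisy = noisy_layer(x, w, stage, kind, 0.3, rng)
            print(stage, kind, float(np.mean((noisy - clean) ** 2)))
```

With a bounded activation such as `tanh`, noise injected before the activation cannot push the output outside the activation's range, which gives one intuition for the nonlinear filtering described above. The sketch covers a single layer only, so it does not reproduce the depth-dependent accumulation that the paper attributes to the subsequent weight matrices.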