Particle Physics Seminar: Charting the Topography of the Neural Network Landscape with Thermal-Like Noise

Noam Levi, TAU

27 April 2023, 10:00 
Shenkar Building, Melamed Hall 006 

Abstract:

The training of neural networks is a challenging optimization problem, and understanding the loss landscape that guides the optimization process remains an open question in computer science. In our research, we used statistical mechanics methods, including phase-space exploration with Langevin dynamics, to study this landscape for over-parameterized networks, whose number of parameters far exceeds the number of data points, performing a classification task on random and real data. By analyzing the fluctuation statistics, much as one would for a thermal system at constant temperature, we were able to infer a clear geometric description of the convergence region. We discovered that the convergence region is a low-dimensional manifold whose dimension can be readily obtained from the fluctuations and is controlled by the number of data points near the classification decision boundary. We also found that a quadratic approximation of the loss near the minimum is inadequate, due to the exponential nature of the decision boundary and the flatness of the low-loss region. Our simplified loss model explains this behavior and reproduces the observed fluctuation statistics. I will explain how our findings may bear on the theoretical understanding of deep learning optimization, especially for less understood phenomena such as grokking and double descent.
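
As a rough illustration of the idea of reading a dimension off thermal-like fluctuations, the following is a minimal Python sketch, not the method of the referenced paper. It assumes a toy quadratic loss with a few constrained ("stiff") directions and many flat ones, which is exactly the kind of approximation the abstract says is inadequate for real networks; it is used here only because equipartition then gives a simple check. The function names, the loss, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def langevin_mean_loss(n_stiff=5, n_flat=45, temperature=0.01, lr=0.05,
                       n_steps=20000, burn_in=2000, seed=0):
    """Run discretized overdamped Langevin dynamics on a toy quadratic loss.

    Toy assumption: loss(theta) = 0.5 * sum_i c_i * theta_i**2, with c_i = 1
    along n_stiff directions and c_i = 0 along n_flat directions.
    Returns the time-averaged loss after the burn-in period.
    """
    rng = np.random.default_rng(seed)
    dim = n_stiff + n_flat
    curvature = np.concatenate([np.ones(n_stiff), np.zeros(n_flat)])
    theta = np.zeros(dim)
    noise_scale = np.sqrt(2.0 * lr * temperature)
    losses = []
    for step in range(n_steps):
        grad = curvature * theta                      # gradient of the toy loss
        theta = theta - lr * grad + noise_scale * rng.standard_normal(dim)
        if step >= burn_in:
            losses.append(0.5 * np.sum(curvature * theta ** 2))
    return float(np.mean(losses))


def estimate_dimension(mean_loss, temperature):
    """Equipartition for a quadratic loss: <loss> is about (k / 2) * T."""
    return 2.0 * mean_loss / temperature


if __name__ == "__main__":
    T = 0.01
    mean_loss = langevin_mean_loss(temperature=T)
    print("estimated number of constrained directions:",
          estimate_dimension(mean_loss, T))
```

In this toy setting the estimate should land near `n_stiff`, up to sampling noise and a small bias from the finite step size, while the flat directions contribute nothing to the loss. The relation between mean loss and temperature would differ for a non-quadratic loss such as the simplified model mentioned in the abstract.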

Reference: https://arxiv.org/pdf/2304.01335.pdf

Seminar Organizer: Dr. Adi Ashkanzi
