Extended critical regimes of deep neural networks

Bibliographic details
Title: Extended critical regimes of deep neural networks
Authors: Qu, Cheng Kevin; Wardak, Asem; Gong, Pulin
Publication year: 2022
Collection: Computer Science; Condensed Matter; Statistics
Subject terms: Computer Science - Machine Learning, Condensed Matter - Disordered Systems and Neural Networks, Condensed Matter - Statistical Mechanics, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Description: Deep neural networks (DNNs) have been successfully applied to many real-world problems, but a complete understanding of their dynamical and computational principles is still lacking. Conventional theoretical frameworks for analysing DNNs often assume random networks with coupling weights obeying Gaussian statistics. However, non-Gaussian, heavy-tailed coupling is a ubiquitous phenomenon in DNNs. Here, by weaving together theories of heavy-tailed random matrices and non-equilibrium statistical physics, we develop a new type of mean field theory for DNNs which predicts that heavy-tailed weights enable the emergence of an extended critical regime without fine-tuning parameters. In this extended critical regime, DNNs exhibit rich and complex propagation dynamics across layers. We further elucidate that the extended criticality endows DNNs with profound computational advantages: balancing the contraction as well as expansion of internal neural representations and speeding up training processes, hence providing a theoretical guide for the design of efficient neural architectures.
Document type: Working Paper
Access URL: http://arxiv.org/abs/2203.12967
Accession number: edsarx.2203.12967
Database: arXiv