Report
Extended critical regimes of deep neural networks
Title: | Extended critical regimes of deep neural networks |
---|---|
Authors: | Qu, Cheng Kevin; Wardak, Asem; Gong, Pulin |
Publication year: | 2022 |
Collection: | Computer Science, Condensed Matter, Statistics |
Subject terms: | Computer Science - Machine Learning, Condensed Matter - Disordered Systems and Neural Networks, Condensed Matter - Statistical Mechanics, Computer Science - Artificial Intelligence, Statistics - Machine Learning |
Description: | Deep neural networks (DNNs) have been successfully applied to many real-world problems, but a complete understanding of their dynamical and computational principles is still lacking. Conventional theoretical frameworks for analysing DNNs often assume random networks with coupling weights obeying Gaussian statistics. However, non-Gaussian, heavy-tailed coupling is a ubiquitous phenomenon in DNNs. Here, by weaving together theories of heavy-tailed random matrices and non-equilibrium statistical physics, we develop a new type of mean field theory for DNNs which predicts that heavy-tailed weights enable the emergence of an extended critical regime without fine-tuning parameters. In this extended critical regime, DNNs exhibit rich and complex propagation dynamics across layers. We further elucidate that the extended criticality endows DNNs with profound computational advantages: balancing the contraction as well as expansion of internal neural representations and speeding up training processes, hence providing a theoretical guide for the design of efficient neural architectures. |
Document type: | Working Paper |
Access URL: | http://arxiv.org/abs/2203.12967 |
Accession number: | edsarx.2203.12967 |
Database: | arXiv |