Bayes Complexity of Learners vs Overfitting

Bibliographic Details
Title: Bayes Complexity of Learners vs Overfitting
Authors: Głuch, Grzegorz; Urbanke, Rudiger
Publication Year: 2023
Collection: Computer Science; Statistics
Subject Terms: Computer Science - Machine Learning, Statistics - Machine Learning
Description: We introduce a new notion of complexity of functions and we show that it has the following properties: (i) it governs a PAC-Bayes-like generalization bound, (ii) for neural networks it relates to natural notions of complexity of functions (such as the variation), and (iii) it explains the generalization gap between neural networks and linear schemes. While there is a large body of work describing bounds that have each of these properties in isolation, and even some that have two, as far as we know, this is the first notion that satisfies all three of them. Moreover, in contrast to previous works, our notion naturally generalizes to neural networks with several layers. Even though computing our complexity is nontrivial in general, an upper bound is often easy to derive, even for a larger number of layers and for functions with structure, such as periodic functions. An upper bound we derive allows us to show a separation, for periodic functions, in the number of samples needed for good generalization between 2-layer and 4-layer neural networks.
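For context on the "PAC-Bayes-like generalization bound" mentioned in the description, below is a minimal LaTeX sketch of the classical PAC-Bayes bound in its standard (Maurer/McAllester) form for a loss bounded in [0,1]. The symbols (prior P, posterior Q, sample size n, confidence delta) are the usual textbook ones, not notation taken from this paper; the paper's own bound is not reproduced here.

% Classical PAC-Bayes bound (Maurer's form), for a loss in [0,1].
% P: data-independent prior over hypotheses; Q: any posterior;
% L(h): true risk; \hat{L}_S(h): empirical risk on a sample S of size n.
% NOTE: this is the standard textbook bound, not the paper's result.
\[
\Pr_{S \sim D^n}\!\left[\, \forall Q:\;
  \mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \le \mathbb{E}_{h \sim Q}\!\left[\hat{L}_S(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\,\right] \ge 1 - \delta
\]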
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2303.07874
Accession Number: edsarx.2303.07874
Database: arXiv