Memorization with neural nets: going beyond the worst case

Bibliographic Details
Title: Memorization with neural nets: going beyond the worst case
Authors: Dirksen, Sjoerd; Finke, Patrick; Genzel, Martin
Publication Year: 2023
Collection: Computer Science; Mathematics; Statistics
Subject Terms: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Statistics Theory
Description: In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite dataset with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We illustrate the effectiveness of the algorithm in non-pathological situations with extensive numerical experiments and link the insights back to the theoretical results.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2310.00327
Accession Number: edsarx.2310.00327
Database: arXiv