تقرير
Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start
العنوان: | Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start |
---|---|
المؤلفون: | Grazzi, Riccardo, Pontil, Massimiliano, Salzo, Saverio |
المصدر: | Journal of Machine Learning Research, volume 24, number 167, pages 1-37, year 2023 |
سنة النشر: | 2022 |
المجموعة: | Computer Science Mathematics Statistics |
مصطلحات موضوعية: | Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control |
الوصف: | We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This type of problems include instances of meta-learning, equilibrium models, hyperparameter optimization and data poisoning adversarial attacks. Several recent works have proposed algorithms which warm-start the lower-level problem, i.e.~they use the previous lower-level approximate solution as a staring point for the lower-level solver. This warm-start procedure allows one to improve the sample complexity in both the stochastic and deterministic settings, achieving in some cases the order-wise optimal sample complexity. However, there are situations, e.g., meta learning and equilibrium models, in which the warm-start procedure is not well-suited or ineffective. In this work we show that without warm-start, it is still possible to achieve order-wise (near) optimal sample complexity. In particular, we propose a simple method which uses (stochastic) fixed point iterations at the lower-level and projected inexact gradient descent at the upper-level, that reaches an $\epsilon$-stationary point using $O(\epsilon^{-2})$ and $\tilde{O}(\epsilon^{-1})$ samples for the stochastic and the deterministic setting, respectively. Finally, compared to methods using warm-start, our approach yields a simpler analysis that does not need to study the coupled interactions between the upper-level and lower-level iterates. Comment: Corrected Remark 18 + other small edits. Code at https://github.com/CSML-IIT-UCL/bioptexps |
نوع الوثيقة: | Working Paper |
URL الوصول: | http://arxiv.org/abs/2202.03397 |
رقم الأكسشن: | edsarx.2202.03397 |
قاعدة البيانات: | arXiv |
الوصف غير متاح. |