Multi-Bellman operator for convergence of $Q$-learning with linear function approximation

Bibliographic Details
Title: Multi-Bellman operator for convergence of $Q$-learning with linear function approximation
Authors: Carvalho, Diogo S., Santos, Pedro A., Melo, Francisco S.
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Description: We study the convergence of $Q$-learning with linear function approximation. Our key contribution is the introduction of a novel multi-Bellman operator that extends the traditional Bellman operator. By exploring the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes contractive, providing improved fixed-point guarantees compared to the Bellman operator. To leverage these insights, we propose the multi $Q$-learning algorithm with linear function approximation. We demonstrate that this algorithm converges to the fixed point of the projected multi-Bellman operator, yielding solutions of arbitrary accuracy. Finally, we validate our approach by applying it to well-known environments, showcasing the effectiveness and applicability of our findings.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2309.16819
Accession Number: edsarx.2309.16819
Database: arXiv