Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control

Bibliographic Details
Title: Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control
Authors: Rathod, Santanu, Bhadu, Manoj, De, Abir
Publication Year: 2021
Collection: Computer Science, Mathematics
Subject Terms: Computer Science - Machine Learning, Mathematics - Optimization and Control
Description: Owing to the growing interest in Reinforcement Learning in recent years, gradient-based policy control methods have been gaining popularity for control problems as well. Rightly so, since policy gradient methods have the advantage of optimizing a metric of interest in an end-to-end manner and are relatively easy to implement without complete knowledge of the underlying system. In this paper, we study the global convergence of gradient-based policy optimization methods for quadratic control of discrete-time, model-free Markovian jump linear systems (MJLS). We surmount the myriad challenges that arise from the presence of multiple operating modes coupled with the lack of knowledge of the system dynamics, and show global convergence of the policy using gradient descent and natural policy gradient methods. We also provide simulation studies to corroborate our claims.
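To make the setting concrete, the following is a minimal, illustrative sketch of model-free policy gradient for an MJLS quadratic cost, not the paper's exact algorithm: it uses mode-dependent linear feedback u = -K[w] x and a one-point zeroth-order (smoothed) gradient estimate from perturbed rollouts. All numerical values here (system matrices A, B, costs Q, R, transition matrix P, horizon, smoothing radius, step size) are hypothetical placeholders chosen only for demonstration.

```python
# Hedged sketch: zeroth-order policy gradient for a two-mode Markovian jump
# linear quadratic problem. All system data below is assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

n, m, modes = 2, 1, 2
A = np.stack([np.array([[0.9, 0.5], [0.0, 0.8]]),
              np.array([[0.7, 0.2], [0.1, 0.6]])])
B = np.stack([np.array([[0.0], [1.0]]),
              np.array([[0.5], [1.0]])])
Q = np.stack([np.eye(n), 2.0 * np.eye(n)])
R = np.stack([np.eye(m), np.eye(m)])
P = np.array([[0.9, 0.1], [0.3, 0.7]])   # Markov mode-transition probabilities

def rollout_cost(K, horizon=50):
    """Simulate one trajectory under mode-dependent feedback u = -K[w] x."""
    x = rng.standard_normal(n)
    w = rng.integers(modes)
    cost = 0.0
    for _ in range(horizon):
        u = -K[w] @ x
        cost += x @ Q[w] @ x + u @ R[w] @ u
        x = A[w] @ x + B[w] @ u
        w = rng.choice(modes, p=P[w])       # jump to the next mode
    return cost

def zeroth_order_gradient(K, radius=0.05, samples=200):
    """One-point smoothed gradient estimate built from perturbed rollouts."""
    grad = np.zeros_like(K)
    for _ in range(samples):
        U = rng.standard_normal(K.shape)
        U *= radius / np.linalg.norm(U)      # perturbation on a sphere
        grad += rollout_cost(K + U) * U
    return (K.size / (samples * radius**2)) * grad

# Plain gradient descent on the stacked mode-dependent gains K[w].
K = np.zeros((modes, m, n))
for it in range(100):
    K -= 1e-4 * zeroth_order_gradient(K)
    if it % 20 == 0:
        avg = np.mean([rollout_cost(K) for _ in range(50)])
        print(f"iter {it:3d}  avg cost {avg:8.2f}")
```

A natural policy gradient variant would additionally precondition this estimate with the inverse of the (estimated) state-correlation matrix for each mode; the sketch above shows only the plain gradient-descent case.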
Comment: 42 pages, 3 figures
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2111.15228
Accession Number: edsarx.2111.15228
Database: arXiv