Report
Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control
Title: | Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control |
---|---|
Authors: | Rathod, Santanu; Bhadu, Manoj; De, Abir |
Publication year: | 2021 |
Collection: | Computer Science; Mathematics |
Subject terms: | Computer Science - Machine Learning, Mathematics - Optimization and Control |
Description: | Owing to the growing interest in reinforcement learning over the last few years, gradient-based policy optimization methods have been gaining popularity for control problems as well. This is well justified, since policy gradient methods optimize a metric of interest in an end-to-end manner and are relatively easy to implement without complete knowledge of the underlying system. In this paper, we study the global convergence of gradient-based policy optimization methods for quadratic control of discrete-time, model-free Markovian jump linear systems (MJLS). We overcome the challenges that arise from having multiple states coupled with a lack of knowledge of the system dynamics, and we show global convergence of the policy under gradient descent and natural policy gradient methods. We also provide simulation studies to corroborate our claims. Comment: 42 pages, 3 figures |
Document type: | Working Paper |
Access URL: | http://arxiv.org/abs/2111.15228 |
Accession number: | edsarx.2111.15228 |
Database: | arXiv |