On the continuity and smoothness of the value function in reinforcement learning and optimal control

Bibliographic Details
Title: On the continuity and smoothness of the value function in reinforcement learning and optimal control
Authors: Harder, Hans; Peitz, Sebastian
Publication Year: 2024
Collection: Computer Science
Subject Terms: Electrical Engineering and Systems Science - Systems and Control, Computer Science - Artificial Intelligence, 37H99, 37N35, 93E03, I.2.8
Description: The value function plays a crucial role as a measure for the cumulative future reward an agent receives in both reinforcement learning and optimal control. It is therefore of interest to study how similar the values of neighboring states are, i.e., to investigate the continuity of the value function. We do so by providing and verifying upper bounds on the value function's modulus of continuity. Additionally, we show that the value function is always Hölder continuous under relatively weak assumptions on the underlying system and that non-differentiable value functions can be made differentiable by slightly "disturbing" the system.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2403.14432
Accession Number: edsarx.2403.14432
Database: arXiv