Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation

Bibliographic Details
Title: Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation
Authors: Chen, Xulin; Liu, Ruipeng; Katz, Garrett E.
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning
Description: In robotic control tasks, policies trained by reinforcement learning (RL) in simulation often experience a performance drop when deployed on physical hardware, due to modeling error, measurement error, and unpredictable perturbations in the real world. Robust RL methods account for this issue by approximating a worst-case value function during training, but they can be sensitive to approximation errors in the value function and its gradient before training is complete. In this paper, we hypothesize that Lipschitz regularization can help condition the approximated value function gradients, leading to improved robustness after training. We test this hypothesis by combining Lipschitz regularization with an application of the Fast Gradient Sign Method to reduce approximation errors when evaluating the value function under adversarial perturbations. Our empirical results demonstrate the benefits of this approach over prior work on a number of continuous control benchmarks.
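The description mentions applying the Fast Gradient Sign Method (FGSM) to evaluate the value function under adversarial perturbations. As a minimal illustrative sketch (not the paper's implementation): FGSM perturbs a state against the sign of the value gradient, so the perturbed state is the locally worst case within an L-infinity ball of radius epsilon. The quadratic value function below and its analytic gradient are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical value function V(s) = -||s||^2, maximal at the origin.
def value(s):
    return -np.dot(s, s)

# Its analytic gradient: grad V(s) = -2 s.
def value_grad(s):
    return -2.0 * s

def fgsm_worst_case_state(s, eps):
    """One FGSM step on the state: move against the sign of the value
    gradient, which decreases V as fast as possible per coordinate
    within an L-infinity ball of radius eps."""
    return s - eps * np.sign(value_grad(s))

s = np.array([0.5, -0.3])
s_adv = fgsm_worst_case_state(s, eps=0.1)
# s_adv lies within eps of s in every coordinate but has lower value,
# approximating the worst-case evaluation described in the abstract.
```

In the robust-RL setting described above, such a perturbed state would be used to estimate a worst-case value target during training; a learned critic would supply the gradient (e.g. via automatic differentiation) instead of the analytic form used here.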
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2404.13879
Accession Number: edsarx.2404.13879
Database: arXiv