Behavior Constraining in Weight Space for Offline Reinforcement Learning

Bibliographic Details
Title: Behavior Constraining in Weight Space for Offline Reinforcement Learning
Authors: Swazinna, Phillip; Udluft, Steffen; Hein, Daniel; Runkler, Thomas
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning
Description: In offline reinforcement learning, a policy needs to be learned from a single pre-collected dataset. Typically, policies are thus regularized during training to behave similarly to the data generating policy, by adding a penalty based on a divergence between action distributions of generating and trained policy. We propose a new algorithm, which constrains the policy directly in its weight space instead, and demonstrate its effectiveness in experiments.
Comment: Accepted at ESANN 2021
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2107.05479
Accession Number: edsarx.2107.05479
Database: arXiv