Report
Behavior Constraining in Weight Space for Offline Reinforcement Learning
Title: | Behavior Constraining in Weight Space for Offline Reinforcement Learning |
---|---|
Authors: | Swazinna, Phillip; Udluft, Steffen; Hein, Daniel; Runkler, Thomas |
Publication year: | 2021 |
Collection: | Computer Science |
Subject terms: | Computer Science - Machine Learning |
Description: | In offline reinforcement learning, a policy needs to be learned from a single pre-collected dataset. Typically, policies are thus regularized during training to behave similarly to the data generating policy, by adding a penalty based on a divergence between action distributions of generating and trained policy. We propose a new algorithm, which constrains the policy directly in its weight space instead, and demonstrate its effectiveness in experiments. Comment: Accepted at ESANN 2021 |
Document type: | Working Paper |
Access URL: | http://arxiv.org/abs/2107.05479 |
Accession number: | edsarx.2107.05479 |
Database: | arXiv |
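The description contrasts two ways of keeping a learned policy close to the data generating policy: a penalty on the difference between action distributions, and a constraint applied directly to the policy weights. The following is a minimal sketch of that contrast, not the authors' algorithm. This record does not give the exact form of either term, so everything here is an assumption made for illustration: the linear policy, the squared Euclidean distance in weight space, the mean squared action difference standing in for a divergence, the idea that the generating policy's weights are available (for example from behavior cloning), and all function and variable names.

```python
# Hypothetical illustration only: the penalty forms and all names below are
# assumptions, not taken from the paper.

def weight_space_penalty(theta, theta_behavior):
    """Squared Euclidean distance between two flat parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(theta, theta_behavior))


def linear_policy(theta, state):
    """Deterministic linear policy: action is the dot product of theta and state."""
    return sum(w * s for w, s in zip(theta, state))


def action_space_penalty(theta, theta_behavior, states):
    """Mean squared action difference over dataset states.

    A simple stand-in for a divergence between action distributions.
    """
    diffs = [
        (linear_policy(theta, s) - linear_policy(theta_behavior, s)) ** 2
        for s in states
    ]
    return sum(diffs) / len(diffs)


def regularized_objective(estimated_return, penalty, strength):
    """Estimated return minus a weighted penalty term."""
    return estimated_return - strength * penalty


# Assumed weights of a model of the data generating policy.
theta_behavior = [0.5, -1.0]
# Candidate weights of the policy being trained.
theta = [1.0, -1.0]
# A few states from the (hypothetical) offline dataset.
states = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

w_pen = weight_space_penalty(theta, theta_behavior)
a_pen = action_space_penalty(theta, theta_behavior, states)
objective = regularized_objective(1.0, w_pen, strength=2.0)
```

In this sketch the weight-space term depends only on the two parameter vectors, while the action-space term has to be evaluated on states from the dataset.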