Report
Towards mental time travel: a hierarchical memory for reinforcement learning agents
Title: | Towards mental time travel: a hierarchical memory for reinforcement learning agents |
---|---|
Authors: | Lampinen, Andrew Kyle, Chan, Stephanie C. Y., Banino, Andrea, Hill, Felix |
Source: | Advances in Neural Information Processing Systems, 2021 |
Publication Year: | 2021 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, I.2.6 |
Description: | Reinforcement learning agents often forget details of the past, especially after delays or distractor tasks. Agents with common memory architectures struggle to recall and integrate across multiple timesteps of a past event, or even to recall the details of a single timestep that is followed by distractor tasks. To address these limitations, we propose a Hierarchical Chunk Attention Memory (HCAM), which helps agents to remember the past in detail. HCAM stores memories by dividing the past into chunks, and recalls by first performing high-level attention over coarse summaries of the chunks, and then performing detailed attention within only the most relevant chunks. An agent with HCAM can therefore "mentally time-travel" -- remember past events in detail without attending to all intervening events. We show that agents with HCAM substantially outperform agents with other memory architectures at tasks requiring long-term recall, retention, or reasoning over memory. These include recalling where an object is hidden in a 3D environment, rapidly learning to navigate efficiently in a new neighborhood, and rapidly learning and retaining new object names. Agents with HCAM can extrapolate to task sequences much longer than they were trained on, and can even generalize zero-shot from a meta-learning setting to maintaining knowledge across episodes. HCAM improves agent sample efficiency, generalization, and generality (by solving tasks that previously required specialized architectures). Our work is a step towards agents that can learn, interact, and adapt in complex and temporally-extended environments. Comment: NeurIPS 2021; 10 pages main text; 29 pages total |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2105.14039 |
Accession Number: | edsarx.2105.14039 |
Database: | arXiv |
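
The two-stage recall summarized in the description (coarse attention over chunk summaries, then detailed attention inside only the most relevant chunks) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-pooled summaries, the single attention head, the relevance-weighted sum, and all names (`hierarchical_recall`, `chunk_size`, `top_k`) are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Single-head scaled dot-product attention; returns a weighted sum of values."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def summarize(chunk):
    """Assumption: a chunk's coarse summary is the mean of its memory vectors."""
    dim = len(chunk[0])
    return [sum(m[i] for m in chunk) / len(chunk) for i in range(dim)]


def hierarchical_recall(query, memories, chunk_size=4, top_k=2):
    """Attend over chunk summaries first, then only within the top_k chunks."""
    # Store: divide the past into fixed-size chunks.
    chunks = [memories[i:i + chunk_size] for i in range(0, len(memories), chunk_size)]
    summaries = [summarize(c) for c in chunks]

    # Stage 1: high-level attention over the coarse summaries.
    scale = math.sqrt(len(query))
    relevance = softmax([dot(query, s) / scale for s in summaries])
    top = sorted(range(len(chunks)), key=lambda i: relevance[i], reverse=True)[:top_k]

    # Stage 2: detailed attention within the selected chunks only,
    # combined by chunk relevance (an assumed weighting scheme).
    out = [0.0] * len(query)
    for i in top:
        detail = attend(query, chunks[i], chunks[i])
        for d in range(len(query)):
            out[d] += relevance[i] * detail[d]
    return out, sorted(top)


# Toy memory: an older chunk of "event A" vectors, then a chunk of distractors.
memories = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
recalled, selected = hierarchical_recall([5.0, 0.0], memories, chunk_size=4, top_k=1)
```

In this toy setup the query resembles the older chunk, so only that chunk is attended to in detail and the intervening distractor memories are skipped, which is the behavior the description calls "mental time travel".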