Teaching on a Budget in Multi-Agent Deep Reinforcement Learning

Bibliographic Details
Title: Teaching on a Budget in Multi-Agent Deep Reinforcement Learning
Authors: Jeremy Gow, Ercument Ilhan, Diego Perez-Liebana
Source: CoG
Publication Information: arXiv, 2019.
Publication Year: 2019
Subject Terms: Random graph, FOS: Computer and information sciences, Computer Science - Machine Learning, Computer science, business.industry, Reuse, Machine learning, computer.software_genre, Sequential decision, Machine Learning (cs.LG), Reinforcement learning, Leverage (statistics), Computer Science - Multiagent Systems, Artificial intelligence, Heuristics, business, computer, Drawback, Multiagent Systems (cs.MA)
Description: Deep Reinforcement Learning (RL) algorithms can solve complex sequential decision tasks successfully. However, they suffer from poor sample efficiency, a drawback that can often be addressed through knowledge reuse. In Multi-Agent Reinforcement Learning (MARL) this drawback becomes worse, but agent interactions also present new opportunities to leverage knowledge. One promising approach is peer-to-peer action advising through a teacher-student framework. Although it was originally introduced for single-agent RL, recent studies show that it can also be applied to multi-agent scenarios with promising empirical results. However, studies in this line of research are currently very limited. In this paper, we propose heuristics-based action advising techniques for cooperative decentralised MARL, using a task-level policy based on nonlinear function approximation. By adopting the Random Network Distillation technique, we devise a measure that lets agents assess their knowledge in any given state and initiate the teacher-student dynamics with no prior role assumptions. Experimental results in a gridworld environment show that such an approach may indeed be useful and needs to be further investigated.
Comment: 8 pages
DOI: 10.48550/arxiv.1905.01357
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a993ebc10a1e8c903092277c3fbdc659
Rights: OPEN
Accession Number: edsair.doi.dedup.....a993ebc10a1e8c903092277c3fbdc659
Database: OpenAIRE
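
Illustrative note: the description above mentions using Random Network Distillation (RND) so that agents can assess their own knowledge of a state and decide when to ask for or give advice. The following is a minimal sketch of that idea, not the authors' implementation. It assumes one-hot gridworld states, a single-layer fixed random target network, a linear predictor, and simple threshold and budget checks. The names RNDKnowledgeEstimator, should_advise, ask_threshold, give_threshold and budget are hypothetical and do not come from the paper.

```python
import math
import random


class RNDKnowledgeEstimator:
    """Hypothetical RND-style estimator: low error suggests a familiar state."""

    def __init__(self, state_dim, feature_dim=8, lr=0.5, seed=0):
        rng = random.Random(seed)
        # Target network: fixed random weights that are never trained.
        self.target_w = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                         for _ in range(feature_dim)]
        # Predictor network: starts at zero and learns to imitate the target.
        self.pred_w = [[0.0] * state_dim for _ in range(feature_dim)]
        self.lr = lr

    def _target(self, state):
        return [math.tanh(sum(w * x for w, x in zip(row, state)))
                for row in self.target_w]

    def _predict(self, state):
        return [sum(w * x for w, x in zip(row, state)) for row in self.pred_w]

    def uncertainty(self, state):
        """Mean squared error between predictor and target outputs."""
        t, p = self._target(state), self._predict(state)
        return sum((a - b) ** 2 for a, b in zip(p, t)) / len(t)

    def update(self, state):
        """One gradient step on the squared error (constants folded into lr)."""
        t, p = self._target(state), self._predict(state)
        for i, row in enumerate(self.pred_w):
            err = p[i] - t[i]
            for j, x in enumerate(state):
                row[j] -= self.lr * err * x


def should_advise(student, teacher, state, ask_threshold, give_threshold, budget):
    """Assumed heuristic: advise only if budget remains, the student is
    uncertain about the state, and the teacher is confident about it."""
    return (budget > 0
            and student.uncertainty(state) > ask_threshold
            and teacher.uncertainty(state) < give_threshold)


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


# Usage: the expert has visited the state many times, the novice never has.
state = one_hot(2, 4)
novice = RNDKnowledgeEstimator(state_dim=4, seed=0)
expert = RNDKnowledgeEstimator(state_dim=4, seed=1)
for _ in range(50):
    expert.update(state)

advise = should_advise(novice, expert, state,
                       ask_threshold=1e-3, give_threshold=1e-3, budget=5)
```

In this sketch neither agent is assigned a teacher or student role in advance: the roles follow from which agent's predictor error is low or high for the state at hand, and the budget check stops advice exchange once it is spent.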