Academic Journal

Target-Oriented Multi-Agent Coordination with Hierarchical Reinforcement Learning.

Bibliographic Details
Title: Target-Oriented Multi-Agent Coordination with Hierarchical Reinforcement Learning.
Authors: Yu, Yuekang; Zhai, Zhongyi; Li, Weikun; Ma, Jianyu
Source: Applied Sciences (2076-3417); Aug 2024, Vol. 14 Issue 16, p7084, 21p
Subject Terms: GOAL (Psychology), GLOBAL method of teaching, REINFORCEMENT learning
Abstract: In target-oriented multi-agent tasks, agents collaboratively achieve goals defined by specific objects, or targets, in their environment. The key to success is the effective coordination between agents and these targets, especially in dynamic environments where targets may shift. Agents must adeptly adjust to these changes and re-evaluate their target interactions. Inefficient coordination can lead to resource waste, extended task times, and lower overall performance. Addressing this challenge, we introduce the regulatory hierarchical multi-agent coordination (RHMC), a hierarchical reinforcement learning approach. RHMC divides the coordination task into two levels: a high-level policy, assigning targets based on environmental state, and a low-level policy, executing basic actions guided by individual target assignments and observations. Stabilizing RHMC's high-level policy is crucial for effective learning. This stability is achieved by reward regularization, reducing reliance on the dynamic low-level policy. Such regularization ensures the high-level policy remains focused on broad coordination, not overly dependent on specific agent actions. By minimizing low-level policy dependence, RHMC adapts more seamlessly to environmental changes, boosting learning efficiency. Testing demonstrates RHMC's superiority over existing methods in global reward and learning efficiency, highlighting its effectiveness in multi-agent coordination. [ABSTRACT FROM AUTHOR]
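The two-level split described in the abstract can be illustrated with a toy sketch. This is not the authors' RHMC implementation: both levels here are hand-written rules standing in for learned policies, reward regularization is not modeled at all, and every name (`assign_targets`, `step_toward`, `run_episode`) and the grid-world setting are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Grid distance between two cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def assign_targets(agents: List[Point], targets: List[Point]) -> Dict[int, int]:
    """High-level stand-in (hypothetical): map agent index -> target index.

    A greedy nearest-unclaimed-target rule replaces the learned
    high-level policy; it only looks at the global state.
    """
    assignment: Dict[int, int] = {}
    free = set(range(len(targets)))
    for i, pos in enumerate(agents):
        if not free:
            break
        # Ties are broken by the lower target index for determinism.
        best = min(free, key=lambda j: (manhattan(pos, targets[j]), j))
        assignment[i] = best
        free.remove(best)
    return assignment


def step_toward(pos: Point, target: Point) -> Point:
    """Low-level stand-in (hypothetical): one basic move toward the target."""
    x, y = pos
    tx, ty = target
    if x != tx:
        return (x + (1 if tx > x else -1), y)
    if y != ty:
        return (x, y + (1 if ty > y else -1))
    return pos


def run_episode(agents: List[Point], targets: List[Point],
                max_steps: int = 50) -> Tuple[int, List[Point]]:
    """Alternate the two levels until every assigned agent sits on its target."""
    agents = list(agents)
    for t in range(max_steps):
        # Targets are re-assigned every step, so a shifted target
        # would be picked up on the next iteration.
        assignment = assign_targets(agents, targets)
        if all(agents[i] == targets[j] for i, j in assignment.items()):
            return t, agents
        agents = [
            step_toward(p, targets[assignment[i]]) if i in assignment else p
            for i, p in enumerate(agents)
        ]
    return max_steps, agents


if __name__ == "__main__":
    steps, final = run_episode([(0, 0), (5, 5)], [(4, 4), (1, 0)])
    print(steps, final)
```

In a learned version, each function would presumably be replaced by a trained policy; the sketch only shows how target assignment and basic action execution are separated.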
Database: Complementary Index
Description
ISSN: 2076-3417
DOI: 10.3390/app14167084