KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models

Bibliographic Details
Title: KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models
Authors: Jiang, Kemou; Cai, Xuan; Cui, Zhiyong; Li, Aoyong; Ren, Yilong; Yu, Haiyang; Yang, Hao; Fu, Daocheng; Wen, Licheng; Cai, Pinlong
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Artificial Intelligence
Description: Large language models (LLMs) as autonomous agents offer a novel avenue for tackling real-world challenges in a knowledge-driven manner. These LLM-enhanced methodologies excel in generalization and interpretability. However, the complexity of driving tasks often necessitates the collaboration of multiple, heterogeneous agents, underscoring the need for such LLM-driven agents to engage in cooperative knowledge sharing and cognitive synergy. Despite the promise of LLMs, current applications predominantly center on single-agent scenarios. To broaden the horizons of knowledge-driven strategies and bolster the generalization capabilities of autonomous agents, we propose the KoMA framework, consisting of multi-agent interaction, multi-step planning, shared memory, and ranking-based reflection modules, to enhance multi-agent decision-making in complex driving scenarios. Based on the framework's generated text descriptions of driving scenarios, the multi-agent interaction module enables LLM agents to analyze and infer the intentions of surrounding vehicles, akin to human cognition. The multi-step planning module enables LLM agents to analyze and obtain final action decisions layer by layer, ensuring that short-term action decisions remain consistent with their goals. The shared memory module can accumulate collective experience to make superior decisions, and the ranking-based reflection module can evaluate and improve agent behavior with the aim of enhancing driving safety and efficiency. The KoMA framework not only enhances the robustness and adaptability of autonomous driving agents but also significantly elevates their generalization capabilities across diverse scenarios. Empirical results demonstrate the superiority of our approach over traditional methods, particularly in its ability to handle complex, unpredictable driving environments without extensive retraining.
Comment: 13 pages, 18 figures
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.14239
Accession Number: edsarx.2407.14239
Database: arXiv