RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models

Bibliographic Details
Title: RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models
Authors: Wang, Zefan; Liu, Zichuan; Zhang, Yingying; Zhong, Aoxiao; Wang, Jihong; Yin, Fengbin; Fan, Lunting; Wu, Lingfei; Wen, Qingsong
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Software Engineering, Computer Science - Computation and Language
Description: Large language model (LLM) applications in cloud root cause analysis (RCA) have been actively explored recently. However, current methods are still reliant on manual workflow settings and do not unleash LLMs' decision-making and environment interaction capabilities. We present RCAgent, a tool-augmented LLM autonomous agent framework for practical and privacy-aware industrial RCA usage. Running on an internally deployed model rather than GPT families, RCAgent is capable of free-form data collection and comprehensive analysis with tools. Our framework combines a variety of enhancements, including a unique Self-Consistency for action trajectories, and a suite of methods for context management, stabilization, and importing domain knowledge. Our experiments show RCAgent's evident and consistent superiority over ReAct across all aspects of RCA -- predicting root causes, solutions, evidence, and responsibilities -- and tasks covered or uncovered by current rules, as validated by both automated metrics and human evaluations. Furthermore, RCAgent has already been integrated into the diagnosis and issue discovery workflow of the Real-time Compute Platform for Apache Flink of Alibaba Cloud.
Comment: Accepted by the 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024)
Document Type: Working Paper
DOI: 10.1145/3627673.3680016
Access URL: http://arxiv.org/abs/2310.16340
Accession Number: edsarx.2310.16340
Database: arXiv