Using Large Language Models to Automate and Expedite Reinforcement Learning with Reward Machine

Bibliographic Details
Title:	Using Large Language Models to Automate and Expedite Reinforcement Learning with Reward Machine
Authors:	Alsadat, Shayan Meshkat, Gaglione, Jean-Raphael, Neider, Daniel, Topcu, Ufuk, Xu, Zhe
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Description:	We present LARL-RM (Large language model-generated Automaton for Reinforcement Learning with Reward Machine), an algorithm that encodes high-level knowledge into reinforcement learning as an automaton in order to expedite learning. Rather than supplying this knowledge to the reinforcement learning algorithm directly, which requires an expert to encode the automaton, our method uses prompt engineering to obtain high-level domain-specific knowledge from a large language model (LLM). We use chain-of-thought and few-shot prompting and demonstrate that the method works with both. LARL-RM also allows fully closed-loop reinforcement learning with no expert to guide and supervise the learning, since it can query the LLM directly for the high-level knowledge required by the task at hand. We provide a theoretical guarantee that the algorithm converges to an optimal policy, and we demonstrate in two case studies that LARL-RM speeds up convergence by 30%.
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2402.07069
Accession Number:	edsarx.2402.07069
Database:	arXiv
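
For readers unfamiliar with the term, a reward machine is a finite automaton whose transitions fire on high-level event labels and emit rewards. The sketch below illustrates that general idea only; it is not the LARL-RM algorithm or the authors' code. The class name `RewardMachine`, the labels "key" and "door", and the reward values are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Minimal illustrative reward machine (hypothetical, not from the paper)."""

    initial_state: int
    # Maps (state, label) -> (next_state, reward).
    transitions: dict
    state: int = field(init=False)

    def __post_init__(self):
        self.state = self.initial_state

    def reset(self):
        """Return to the initial state at the start of an episode."""
        self.state = self.initial_state

    def step(self, label):
        """Advance on an observed label and return the emitted reward.

        Labels with no matching transition leave the state unchanged
        and yield a reward of 0.0 (an assumption of this sketch).
        """
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0)
        )
        self.state = next_state
        return reward


# Hypothetical task: pick up a key, then open a door.
transitions = {
    (0, "key"): (1, 0.0),   # key collected, no reward yet
    (1, "door"): (2, 1.0),  # door opened after key, task complete
}
rm = RewardMachine(initial_state=0, transitions=transitions)
rewards = [rm.step(label) for label in ["door", "key", "none", "door"]]
```

In this toy run, reaching the door before the key earns nothing, so the reward depends on the history of events rather than on the current observation alone. According to the abstract, LARL-RM obtains this kind of high-level automaton knowledge from an LLM instead of from an expert.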
