Koopman-Assisted Reinforcement Learning

Bibliographic details
Title:	Koopman-Assisted Reinforcement Learning
Authors:	Rozwood, Preston; Mehrez, Edward; Paehler, Ludger; Sun, Wen; Brunton, Steven L.
Publication year:	2024
Collection:	Computer Science; Mathematics
Subject terms:	Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Mathematics - Dynamical Systems, Mathematics - Optimization and Control
Description:	The Bellman equation and its continuous form, the Hamilton-Jacobi-Bellman (HJB) equation, are ubiquitous in reinforcement learning (RL) and control theory. However, these equations quickly become intractable for systems with high-dimensional states and nonlinearity. This paper explores the connection between the data-driven Koopman operator and Markov Decision Processes (MDPs), resulting in the development of two new RL algorithms to address these limitations. We leverage Koopman operator techniques to lift a nonlinear system into new coordinates where the dynamics become approximately linear, and where HJB-based methods are more tractable. In particular, the Koopman operator can capture the expectation of the time evolution of the value function of a given system via linear dynamics in the lifted coordinates. By parameterizing the Koopman operator with the control actions, we construct a "Koopman tensor" that facilitates the estimation of the optimal value function. Then, a transformation of Bellman's framework in terms of the Koopman tensor enables us to reformulate two max-entropy RL algorithms: soft value iteration and soft actor-critic (SAC). This highly flexible framework can be used for deterministic or stochastic systems as well as for discrete or continuous-time dynamics. Finally, we show that these Koopman-Assisted Reinforcement Learning (KARL) algorithms attain state-of-the-art (SOTA) performance with respect to traditional neural network-based SAC and linear quadratic regulator (LQR) baselines on four controlled dynamical systems: a linear state-space system, the Lorenz system, fluid flow past a cylinder, and a double-well potential with non-isotropic stochastic forcing. Comment: 35 pages, 12 figures
Document type:	Working Paper
Access URL:	http://arxiv.org/abs/2403.02290
Accession number:	edsarx.2403.02290
Database:	arXiv
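
Illustrative note: the abstract describes parameterizing the Koopman operator by the control action to obtain a "Koopman tensor" that evolves lifted coordinates linearly. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The dictionaries `phi` and `psi`, the toy linear system (`A`, `B`), the least-squares fit on Kronecker features, and the value weights `w` are all hypothetical choices made for illustration.

```python
import numpy as np

def phi(x):
    """State dictionary (assumed): a constant plus the raw state."""
    return np.concatenate(([1.0], x))

def psi(u):
    """Action dictionary (assumed): a constant plus the scalar action."""
    return np.array([1.0, u])

def fit_koopman_tensor(X, U, X_next):
    """Least-squares fit of T so that phi(x') ~= (sum_j T[:, j, :] psi_j(u)) @ phi(x)."""
    phi_next = np.array([phi(x) for x in X_next])                   # (N, d_phi)
    Z = np.array([np.kron(psi(u), phi(x)) for x, u in zip(X, U)])   # (N, d_psi * d_phi)
    M, *_ = np.linalg.lstsq(Z, phi_next, rcond=None)                # (d_psi * d_phi, d_phi)
    d_phi = phi_next.shape[1]
    d_psi = Z.shape[1] // d_phi
    # Index order (output feature i, action feature j, state feature k)
    return M.T.reshape(d_phi, d_psi, d_phi)

def koopman_matrix(T, u):
    """Contract the tensor with psi(u) to get the action-dependent matrix K(u)."""
    return np.einsum("ijk,j->ik", T, psi(u))

# Toy controlled linear system x' = A x + B u (hypothetical values)
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([0.0, 0.5])
X = rng.normal(size=(200, 2))
U = rng.normal(size=200)
X_next = X @ A.T + np.outer(U, B)

T = fit_koopman_tensor(X, U, X_next)

# One-step prediction in lifted coordinates for an unseen state-action pair
x_test, u_test = np.array([0.3, -1.2]), 0.7
pred = koopman_matrix(T, u_test) @ phi(x_test)
true = phi(A @ x_test + B * u_test)

# If a value function is linear in the dictionary, V(x) = w @ phi(x),
# its next-step value follows from the same linear map.
w = np.array([0.0, 1.0, -2.0])  # hypothetical value weights
expected_next_value = w @ pred
```

For this toy system the chosen dictionaries represent the dynamics exactly, so `pred` matches `true` up to floating-point error. For nonlinear or stochastic systems the fit would only be approximate, and its quality would depend on the choice of dictionary.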
