Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Bibliographic Details
Title: Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates
Authors: Chong, Bin; Yang, Yingguang; Wang, Zi-Le; Xing, Hang; Liu, Zhirong
Source: Phys. Chem. Chem. Phys. 23 (11), 6800-6806 (2021)
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning
Description: Most algorithms for the multi-armed bandit problem in reinforcement learning aim to maximize the expected reward, and are thus useful in searching for the optimized candidate with the highest reward (function value) in diverse applications (e.g., AlphaGo). However, in some typical application scenarios such as drug discovery, the aim is to search for a diverse set of candidates with high reward. Here we propose a reversible upper confidence bound (rUCB) algorithm for such a purpose, and demonstrate its application in virtual screening upon intrinsically disordered proteins (IDPs). It is shown that rUCB greatly reduces the query times while achieving both high accuracy and low performance loss. The rUCB may have potential application in multipoint optimization and other reinforcement-learning cases.
Comment: 10 pages, 10 figures
Document Type: Working Paper
DOI: 10.1039/D0CP06378A
Access URL: http://arxiv.org/abs/2112.14893
Accession Number: edsarx.2112.14893
Database: arXiv