Intrinsic rewards explain context-sensitive valuation in reinforcement learning.

Bibliographic Details
Title:	Intrinsic rewards explain context-sensitive valuation in reinforcement learning.
Authors:	Molinaro G; Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America., Collins AGE; Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America.; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America.
Source:	PLoS biology [PLoS Biol] 2023 Jul 17; Vol. 21 (7), pp. e3002201. Date of Electronic Publication: 2023 Jul 17 (Print Publication: 2023).
Publication Type:	Journal Article
Language:	English
Journal Info:	Publisher: Public Library of Science Country of Publication: United States NLM ID: 101183755 Publication Model: eCollection Cited Medium: Internet ISSN: 1545-7885 (Electronic) Linking ISSN: 1544-9173 NLM ISO Abbreviation: PLoS Biol Subsets: MEDLINE
Imprint Name(s):	Original Publication: San Francisco, CA : Public Library of Science, [2003]-
MeSH Terms:	Reinforcement, Psychology*; Reward*; Humans; Learning; Motivation
Abstract:	When observing the outcome of a choice, people are sensitive to the choice's context, such that the experienced value of an option depends on the alternatives: getting $1 when the possibilities were $0 or $1 feels much better than when the possibilities were $1 or $10. Context-sensitive valuation has been documented within reinforcement learning (RL) tasks, in which values are learned from experience through trial and error. Range adaptation, wherein options are rescaled according to the range of values yielded by available options, has been proposed to account for this phenomenon. However, we propose that other mechanisms, reflecting a different theoretical viewpoint, may also explain this phenomenon. Specifically, we theorize that internally defined goals play a crucial role in shaping the subjective value attributed to any given option. Motivated by this theory, we develop a new "intrinsically enhanced" RL model, which combines extrinsically provided rewards with internally generated signals of goal achievement as a teaching signal. Across 7 different studies (including previously published data sets as well as a novel, preregistered experiment with replication and control studies), we show that the intrinsically enhanced model can explain context-sensitive valuation as well as, or better than, range adaptation. Our findings indicate a more prominent role of intrinsic, goal-dependent rewards than previously recognized within formal models of human RL. By integrating internally generated signals of reward, standard RL theories should better account for human behavior, including context-sensitive valuation and beyond. Competing Interests: The authors have declared that no competing interests exist. (Copyright: © 2023 Molinaro, Collins. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
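The contrast the abstract draws between range adaptation and the "intrinsically enhanced" model can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the mixing weight `omega`, the learning rate `alpha`, and the choice to treat the internal goal as "obtain the best outcome available in the context" are all hypothetical and are not specified in this record.

```python
def range_adapted(reward, r_min, r_max):
    """Rescale a reward to [0, 1] using the range of outcomes in its context."""
    if r_max == r_min:
        return 0.0  # assumption: a degenerate range carries no relative information
    return (reward - r_min) / (r_max - r_min)


def intrinsically_enhanced(reward, r_max, omega):
    """Mix the extrinsic reward with a binary goal-achievement signal.

    Assumption: the goal counts as achieved when the outcome is the best
    one available in the context; omega in [0, 1] weights the intrinsic part.
    """
    goal_achieved = 1.0 if reward >= r_max else 0.0
    return omega * goal_achieved + (1.0 - omega) * reward


def delta_update(value, teaching_signal, alpha):
    """Standard delta-rule update of a learned value toward a teaching signal."""
    return value + alpha * (teaching_signal - value)


# The abstract's example: receiving 1 when the options were {0, 1}
# versus receiving 1 when the options were {1, 10}.
best_case_range = range_adapted(1, r_min=0, r_max=1)    # 1 is the top of its range
worst_case_range = range_adapted(1, r_min=1, r_max=10)  # 1 is the bottom of its range

best_case_intrinsic = intrinsically_enhanced(1, r_max=1, omega=0.5)
worst_case_intrinsic = intrinsically_enhanced(1, r_max=10, omega=0.5)

# One learning step from an initial value of 0 toward each teaching signal.
v_best = delta_update(0.0, best_case_intrinsic, alpha=0.1)
v_worst = delta_update(0.0, worst_case_intrinsic, alpha=0.1)
```

Under both schemes the same nominal outcome produces a larger teaching signal in the context where it is the best available option. The difference in this sketch is that range adaptation discards the absolute magnitude entirely, whereas the intrinsic variant keeps a fraction of it through the `(1 - omega)` term.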
Entry Dates:	Date Created: 20230717 Date Completed: 20230731 Latest Revision: 20230731
Update Code:	20230731
PubMed Central ID:	PMC10374061
DOI:	10.1371/journal.pbio.3002201
PMID:	37459394
Database:	MEDLINE

Description
ISSN:	1545-7885
DOI:	10.1371/journal.pbio.3002201