Reward-Modulated Hebbian Learning of Decision Making

Bibliographic Details
Title: Reward-Modulated Hebbian Learning of Decision Making
Authors: Michael Pfeiffer, Rodney J. Douglas, Bernhard Nessler, Wolfgang Maass
Source: Neural Computation
Publisher Information: MIT Press - Journals, 2010.
Publication Year: 2010
Subject Terms: Computer Science::Machine Learning, Computer science, Cognitive Neuroscience, Competitive learning, Decision Making, Action Potentials, Machine learning, computer.software_genre, Synaptic Transmission, 03 medical and health sciences, Synaptic weight, 0302 clinical medicine, Reward, Arts and Humanities (miscellaneous), Artificial Intelligence, Generalized Hebbian Algorithm, Anti-Hebbian learning, Leabra, Learning rule, Computer Simulation, 030304 developmental biology, Neurons, 0303 health sciences, Quantitative Biology::Neurons and Cognition, business.industry, Brain, Bayes Theorem, Mathematical Concepts, Hebbian theory, Unsupervised learning, Neural Networks, Computer, Artificial intelligence, Nerve Net, business, computer, Algorithms, 030217 neurology & neurosurgery
Description: We introduce a framework for decision making in which the learning of decision making is reduced to its simplest and biologically most plausible form: Hebbian learning on a linear neuron. We cast our Bayesian-Hebb learning rule as reinforcement learning in which certain decisions are rewarded and prove that each synaptic weight will on average converge exponentially fast to the log-odd of receiving a reward when its pre- and postsynaptic neurons are active. In our simple architecture, a particular action is selected from the set of candidate actions by a winner-take-all operation. The global reward assigned to this action then modulates the update of each synapse. Apart from this global reward signal, our reward-modulated Bayesian Hebb rule is a pure Hebb update that depends only on the coactivation of the pre- and postsynaptic neurons, not on the weighted sum of all presynaptic inputs to the postsynaptic neuron as in the perceptron learning rule or the Rescorla-Wagner rule. This simple approach to action-selection learning requires that information about sensory inputs be presented to the Bayesian decision stage in a suitably preprocessed form resulting from other adaptive processes (acting on a larger timescale) that detect salient dependencies among input features. Hence our proposed framework for fast learning of decisions also provides interesting new hypotheses regarding neural nodes and computational goals of cortical areas that provide input to the final decision stage.
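Illustrative note: the description states that each weight converges on average to the log-odds of reward when its pre- and postsynaptic neurons are both active, but it does not give the update rule itself. The Python sketch below therefore assumes one particular update with that fixed point: add eta * (1 + exp(-w)) after a rewarded coactivation and subtract eta * (1 + exp(w)) after an unrewarded one. The function names, the learning rate `eta`, and the Bernoulli reward model are hypothetical choices made for illustration, and the winner-take-all action selection is not modeled. It simulates a single synapse and compares the time-averaged weight with log(p / (1 - p)).

```python
import math
import random


def bayesian_hebb_update(w, rewarded, eta=0.005):
    """Update one weight after its pre- and postsynaptic neurons were coactive.

    ASSUMED rule (not quoted from the record): the expected change is zero
    exactly when exp(w) = p / (1 - p), i.e. when w is the log-odds of reward.
    """
    if rewarded:
        return w + eta * (1.0 + math.exp(-w))
    return w - eta * (1.0 + math.exp(w))


def log_odds(p):
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def simulate(p_reward, steps=20000, eta=0.005, seed=0):
    """Run the update on Bernoulli(p_reward) rewards.

    Returns the mean weight over the second half of the run, which smooths
    out the step-to-step fluctuations around the fixed point.
    """
    rng = random.Random(seed)  # hypothetical reward source for the demo
    w = 0.0
    tail = []
    for step in range(steps):
        rewarded = rng.random() < p_reward
        w = bayesian_hebb_update(w, rewarded, eta)
        if step >= steps // 2:
            tail.append(w)
    return sum(tail) / len(tail)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(f"p={p}: learned w={simulate(p):.3f}, log-odds={log_odds(p):.3f}")
```

Because the update reads only the weight itself and the global reward, it never needs the weighted sum of all presynaptic inputs, which is the contrast the description draws with the perceptron and Rescorla-Wagner rules.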
ISSN: 1530-888X
0899-7667
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1b672367efab642eb0209eeb2c873efc
https://doi.org/10.1162/neco.2010.03-09-980
Rights: CLOSED
Accession Number: edsair.doi.dedup.....1b672367efab642eb0209eeb2c873efc
Database: OpenAIRE