FUZZY STATE AGGREGATION AND POLICY HILL CLIMBING FOR STOCHASTIC ENVIRONMENTS

Bibliographic Details
Title: FUZZY STATE AGGREGATION AND POLICY HILL CLIMBING FOR STOCHASTIC ENVIRONMENTS
Authors: Gilbert L. Peterson, Dean C. Wardell
Source: International Journal of Computational Intelligence and Applications, pp. 413-428
Publication Information: World Scientific Pub Co Pte Lt, 2006.
Publication Year: 2006
Subject Terms: Computer science, Machine learning, Fuzzy logic, Computer Science Applications, Theoretical Computer Science, Function approximation, Software agent, Reinforcement learning, Robot, Unsupervised learning, Artificial intelligence, Hill climbing, Software
Description: Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and its ability to continue learning even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation (FSA) as the means of function approximation, combined with the fast policy hill climbing (PHC) methods Win or Learn Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF). The combination of fast policy hill climbing and fuzzy state aggregation function approximation is tested in two stochastic environments: Tileworld and the simulated robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns more quickly and performs better than an agent using fuzzy state aggregation with Q-learning alone. Results from the multi-agent RoboCup domain again illustrate that the policy hill climbing algorithms outperform Q-learning alone in a multi-agent environment. Learning is further enhanced by allowing the agents to share their experience through weighted strategy sharing.
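The WoLF update the abstract refers to follows Bowling and Veloso's policy hill climbing scheme: the agent takes larger policy steps when losing and smaller ones when winning, judged by comparing the current policy against a running average policy under the current Q estimates. The sketch below is a minimal tabular illustration of that rule in Python; the class name, the parameter values, and the plain Q-table are illustrative assumptions, since the paper instead approximates Q over fuzzy state aggregates.

import numpy as np

class WoLFPHC:
    # Minimal tabular Win-or-Learn-Fast policy hill climbing (illustrative sketch).
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.Q = np.zeros((n_states, n_actions))                       # action-value estimates
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)      # current mixed policy
        self.pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)  # running average policy
        self.visits = np.zeros(n_states)                               # state visit counts
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose

    def act(self, s):
        # Sample an action from the current mixed policy for state s.
        return np.random.choice(self.pi.shape[1], p=self.pi[s])

    def update(self, s, a, r, s_next):
        # Standard Q-learning backup.
        self.Q[s, a] += self.alpha * (r + self.gamma * self.Q[s_next].max() - self.Q[s, a])

        # Maintain the running average policy for this state.
        self.visits[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.visits[s]

        # Win or Learn Fast: learn slowly while winning (current policy
        # outperforms the average policy under Q), quickly while losing.
        winning = self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s]
        delta = self.delta_win if winning else self.delta_lose

        # Hill-climb toward the greedy action, then re-normalize the policy.
        greedy = self.Q[s].argmax()
        n = self.pi.shape[1]
        for b in range(n):
            step = delta if b == greedy else -delta / (n - 1)
            self.pi[s, b] = max(0.0, min(1.0, self.pi[s, b] + step))
        self.pi[s] /= self.pi[s].sum()

In the paper's setting the tabular Q[s] would be replaced by a fuzzy-state-aggregation approximation, roughly Q(s, a) = Σ_i μ_i(s) q_i(a) with μ_i(s) the membership degree of state s in aggregate region i; that substitution is a sketch of the FSA form, not code taken from the paper.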
ISSN: 1757-5885; 1469-0268
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::7b1d1427ac6665c46f3079b875c65b76
https://doi.org/10.1142/s1469026806001903
Accession Number: edsair.doi...........7b1d1427ac6665c46f3079b875c65b76
Database: OpenAIRE