DOI: 10.1145/1523103.1523111

Modeling reinforcement learning algorithms for performance analysis

Published: 23 January 2009

Abstract

Reinforcement learning algorithms address the problem of an autonomous agent that interacts with its environment and must choose actions that achieve its goals. The performance of such an agent is determined by how quickly it learns and converges to an optimal solution, and Q-learning and prioritized sweeping are two well-known techniques for achieving this. In this paper we analyze the performance of Q-learning and prioritized sweeping as examples of model-free and model-based reinforcement learning, respectively. We also analyze the optimal number of backups required for prioritized sweeping. We fit a regression model to the prioritized-sweeping results and evaluate its predictions against the accuracy of our simulation results.
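
To make the model-free versus model-based distinction concrete, here is a minimal sketch of the two update styles on a toy problem. It is an illustration, not the paper's implementation: the ChainEnv environment, the hyper-parameter values, and the n_backups planning budget are assumptions chosen for the example, and the prioritized-sweeping loop follows the general scheme of Moore and Atkeson (1993) with a deterministic learned model.

import heapq
import random
from collections import defaultdict

class ChainEnv:
    """Toy 5-state chain: actions move left/right; reward 1 on reaching the right end."""
    actions = (-1, +1)

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):
        self.s = min(max(self.s + a, 0), 4)
        done = (self.s == 4)
        return self.s, (1.0 if done else 0.0), done

def greedy(Q, s, actions):
    # ties are broken by the order of `actions`
    return max(actions, key=lambda a: Q[(s, a)])

def q_learning(env, episodes=200, alpha=0.1, gamma=0.9, eps=0.1):
    """Model-free: exactly one backup per step of real experience."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = random.choice(env.actions) if random.random() < eps else greedy(Q, s, env.actions)
            s2, r, done = env.step(a)
            # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_b Q(s',b) - Q(s,a)]
            target = r + gamma * max(Q[(s2, b)] for b in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

def prioritized_sweeping(env, episodes=200, n_backups=5, gamma=0.9, eps=0.1, theta=1e-4):
    """Model-based: after each real step, replay up to n_backups of the
    highest-priority backups through the learned model."""
    Q = defaultdict(float)
    model = {}               # (s, a) -> (r, s'): the learned (deterministic) model
    pred = defaultdict(set)  # s' -> set of (s, a) pairs observed to lead into s'
    pq = []                  # max-heap via negated priorities
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = random.choice(env.actions) if random.random() < eps else greedy(Q, s, env.actions)
            s2, r, done = env.step(a)
            model[(s, a)] = (r, s2)
            pred[s2].add((s, a))
            p = abs(r + gamma * max(Q[(s2, b)] for b in env.actions) - Q[(s, a)])
            if p > theta:
                heapq.heappush(pq, (-p, (s, a)))
            for _ in range(n_backups):  # the planning budget whose optimum the paper studies
                if not pq:
                    break
                _, (si, ai) = heapq.heappop(pq)
                ri, s2i = model[(si, ai)]
                Q[(si, ai)] = ri + gamma * max(Q[(s2i, b)] for b in env.actions)
                for sp, ap in pred[si]:  # re-prioritize predecessors of the updated state
                    rp, _ = model[(sp, ap)]
                    p = abs(rp + gamma * max(Q[(si, b)] for b in env.actions) - Q[(sp, ap)])
                    if p > theta:
                        heapq.heappush(pq, (-p, (sp, ap)))
            s = s2
    return Q

# Example: compare the two methods on the same toy world
# Q1 = q_learning(ChainEnv())
# Q2 = prioritized_sweeping(ChainEnv(), n_backups=5)

Because the learned model here is deterministic, each planning backup can apply the full one-step target directly, so no learning rate appears in the planning loop. Raising n_backups buys faster convergence in real steps at the cost of more computation per step, which is the trade-off behind the paper's question of the optimal number of backups.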

Cited By

  • (2018) Analysing the power of deep learning techniques over the traditional methods using medicare utilisation and provider data. Journal of Experimental & Theoretical Artificial Intelligence 31(1), 99–115. DOI: 10.1080/0952813X.2018.1518999. Online publication date: 12-Sep-2018.
  • (2017) A learning-based mapreduce scheduler in heterogeneous environments. 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2020–2025. DOI: 10.1109/ICACCI.2017.8126142. Online publication date: Sep-2017.

Published In

ICAC3 '09: Proceedings of the International Conference on Advances in Computing, Communication and Control, January 2009, 707 pages.
ISBN: 9781605583518
DOI: 10.1145/1523103

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Markov decision processes
2. Q-learning
3. prioritized sweeping
4. reinforcement learning
