DOI: 10.5555/1619410.1619497
Article

Improving action selection in MDP's via knowledge transfer

Published: 09 July 2005

Abstract

Temporal-difference reinforcement learning (RL) has been successfully applied in several domains with large state sets. Large action sets, however, have received considerably less attention. This paper demonstrates the use of knowledge transfer between related tasks to accelerate learning with large action sets. We introduce action transfer, a technique that extracts the actions from the (near-)optimal solution to the first task and uses them in place of the full action set when learning any subsequent tasks. When optimal actions make up a small fraction of the domain's action set, action transfer can substantially reduce the number of actions and thus the complexity of the problem. However, action transfer between dissimilar tasks can be detrimental. To address this difficulty, we contribute randomized task perturbation (RTP), an enhancement to action transfer that makes it robust to unrepresentative source tasks. We motivate RTP action transfer with a detailed theoretical analysis featuring a formalism of related tasks and a bound on the suboptimality of action transfer. The empirical results in this paper show the potential of RTP action transfer to substantially expand the applicability of RL to problems with large action sets.
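
The following is a minimal, illustrative sketch of plain action transfer as described above (not the authors' implementation, and without the RTP enhancement): a source task is solved with tabular Q-learning, the actions used by its greedy policy are extracted, and only those actions are offered when learning a target task. The environment interface (reset/step) and names such as source_env, target_env, full_actions, and source_states are hypothetical.

import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    # Tabular Q-learning restricted to a given action set; returns the learned Q-table.
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection over the available actions.
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda x: Q[(s, x)]))
            s2, r, done = env.step(a)
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

def transferred_actions(Q, states, actions):
    # Action transfer: keep only the actions chosen by the source task's greedy policy.
    return {max(actions, key=lambda x: Q[(s, x)]) for s in states}

# Hypothetical usage: when optimal actions are a small fraction of the full set,
# the transferred set is much smaller, so the target task is learned over far
# fewer actions per state.
# Q_src = q_learning(source_env, full_actions)
# small_set = list(transferred_actions(Q_src, source_states, full_actions))
# Q_tgt = q_learning(target_env, small_set)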


Published In

AAAI'05: Proceedings of the 20th national conference on Artificial intelligence - Volume 2
July 2005
1035 pages
ISBN: 1-57735-236-X

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Cited By

  • (2019) A survey on transfer learning for multiagent reinforcement learning systems. Journal of Artificial Intelligence Research 64(1):645-703. DOI: 10.1613/jair.1.11396
  • (2011) Knowledge Transfer between Automated Planners. AI Magazine 32(2):79-94. DOI: 10.1609/aimag.v32i2.2334
  • (2011) Stochastic abstract policies for knowledge transfer in robotic navigation tasks. Proceedings of the 10th Mexican International Conference on Advances in Artificial Intelligence - Volume Part I, 454-465. DOI: 10.1007/978-3-642-25324-9_39
  • (2010) Improving space representation in multiagent learning via tile coding. Proceedings of the 20th Brazilian Conference on Advances in Artificial Intelligence, 153-162. DOI: 10.5555/1929622.1929641
  • (2010) Using spatial hints to improve policy reuse in a reinforcement learning agent. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, 317-324. DOI: 10.5555/1838206.1838251
  • (2009) Exploring compact reinforcement-learning representations with linear regression. Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 591-598. DOI: 10.5555/1795114.1795183
  • (2009) Transfer Learning for Reinforcement Learning Domains: A Survey. The Journal of Machine Learning Research 10:1633-1685. DOI: 10.5555/1577069.1755839
  • (2007) Learning to plan using harmonic analysis of diffusion models. Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling, 224-231. DOI: 10.5555/3037176.3037206
  • (2006) Reusing and building a policy library. Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, 378-381. DOI: 10.5555/3037104.3037159
  • (2006) Probabilistic policy reuse in a reinforcement learning agent. Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, 720-727. DOI: 10.1145/1160633.1160762