DOI: 10.5555/1597538.1597683
Article

Learning partially observable action schemas

Published: 16 July 2006

Abstract

We present an algorithm that derives actions' effects and preconditions in partially observable, relational domains. Our algorithm has two unique features: an expressive relational language, and an exact, tractable computation. The action-schema language that we present permits learning of preconditions and effects that include implicit objects and unstated relationships between objects. For example, we can learn that replacing a blown fuse turns on all the lights whose switch is set to on. The algorithm maintains and outputs a relational-logical representation of all possible action-schema models after a sequence of executed actions and partial observations. Importantly, our algorithm takes time polynomial in the number of time steps and predicates; its dependence on other domain parameters varies with the action-schema language. Our experiments show that the relational structure speeds up both learning and generalization, and outperforms propositional learning methods. It also allows establishing a priori unknown connections between objects (e.g., light bulbs and their switches), and permits learning conditional effects in realistic and complex situations. Our algorithm takes advantage of a DAG structure that can be updated efficiently and preserves compactness of representation.
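The core idea of maintaining all action-schema models consistent with the observed action/observation sequence can be illustrated with a toy sketch. This is not the paper's algorithm (which uses a compact relational DAG representation to stay tractable); it is a brute-force propositional version, with hypothetical fluent names (`light_on`, `fuse_ok`) chosen purely for illustration:

```python
from itertools import product

FLUENTS = ["light_on", "fuse_ok"]

# A candidate model maps each fluent to the action's effect on it:
# "T" (set true), "F" (set false), or "keep" (leave unchanged).
CANDIDATES = [dict(zip(FLUENTS, vals))
              for vals in product(["T", "F", "keep"], repeat=len(FLUENTS))]

def apply_model(model, state):
    """Predict the next state under a candidate effect model."""
    nxt = dict(state)
    for fluent, effect in model.items():
        if effect == "T":
            nxt[fluent] = True
        elif effect == "F":
            nxt[fluent] = False
    return nxt

def filter_models(models, state, observation):
    """Keep only models whose prediction agrees with a partial observation."""
    return [m for m in models
            if all(apply_model(m, state)[f] == v for f, v in observation.items())]

# Before the action, the light is off and the fuse is blown.
state = {"light_on": False, "fuse_ok": False}
# After executing the action, we partially observe only that the light is on.
surviving = filter_models(CANDIDATES, state, {"light_on": True})
```

Here only models that set `light_on` true survive, while the effect on the unobserved `fuse_ok` remains ambiguous. Enumerating candidate models explicitly is exponential in the number of fluents; the paper's contribution is a relational representation that tracks this same set of consistent models compactly and updates it in polynomial time.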


Cited By

View all
  • (2011) Generalised domain model acquisition from action traces. Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling, pp. 42-49. DOI: 10.5555/3038485.3038492
  • (2008) Refining the execution of abstract actions with learned action models. Journal of Artificial Intelligence Research, 32(1):487-523. DOI: 10.5555/1622673.1622685
  • (2006) Learning partially observable action models. Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1, pp. 920-926. DOI: 10.5555/1597538.1597684

Published In

AAAI'06: Proceedings of the 21st national conference on Artificial intelligence - Volume 1
July 2006
1005 pages
ISBN:9781577352815

Sponsors

  • AAAI: American Association for Artificial Intelligence

Publisher

AAAI Press

