Abstract
Machine learning and robotics form an intriguing combination within Artificial Intelligence research. Inverse Reinforcement Learning (IRL) algorithms have generated a great deal of interest in the AI community in recent years. However, very little research has addressed the modelling of agent interactions in multi-robot ad-hoc settings after learning is complete. Moreover, applying IRL in practical robot environments that involve online learning and high levels of uncertainty remains a challenge. Decision-theoretic frameworks used for planning in these environments provide good approximations for computing an agent's optimal policy, but their model parameters are usually specified by a human designer. We describe a unique Bayesian approach for approximating unknown state transition functions. We then propose a novel multi-agent Best Response Model that plugs in the expert's reward structure, learnt through Maximum Entropy Inverse Reinforcement Learning, and uses the transition functions learnt through our Bayes-Adaptive approach to compute an optimal best-response policy for our multi-robot ad-hoc setting. We test our algorithms on a robot debris-sorting task.