Abstract

We target task-open multi-agent systems, in which new tasks are introduced dynamically and agents must continuously learn and adapt. This setting is challenging because the action space and reward function evolve over time, which conventional RL algorithms cannot handle. We present a novel decision-making framework, the Task-Open MDP (TaO-MDP), to capture dynamic task arrivals and evolving environments. We further introduce a multi-agent RL algorithm, Models of Hyper Interactions under Task Openness (MOHITO), that learns a generalized policy for task-open environments. It employs interaction graphs that link agents, tasks, and actions, processed via graph neural networks. MOHITO uses centralized training and decentralized execution to select the best action from the currently available action space. Evaluated in a task-open Ridesharing domain, MOHITO facilitates knowledge transfer and boosts rewards by enabling agents to pool multiple passengers simultaneously. Agents achieve higher rewards than baseline methods, demonstrating MOHITO’s effectiveness in dynamic, task-open environments.
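To make the interaction-graph idea concrete, the sketch below scores a variable set of action nodes by one round of message passing over a small agent-task-action graph and then picks the highest-scoring available action. This is a minimal, hypothetical illustration only: the InteractionGNN class, its layer sizes, the toy graph, and the greedy selection rule are all assumptions for exposition, not the authors' MOHITO implementation.

```python
import torch
import torch.nn as nn

class InteractionGNN(nn.Module):
    """One round of mean-aggregation message passing, then a per-node score."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)       # transform neighbor features
        self.upd = nn.Linear(2 * dim, dim)   # combine self features + messages
        self.score = nn.Linear(dim, 1)       # scalar utility per node

    def forward(self, x, adj):
        # x: (N, dim) node features; adj: (N, N) symmetric 0/1 adjacency
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        m = (adj @ self.msg(x)) / deg        # mean of neighbor messages
        h = torch.relu(self.upd(torch.cat([x, m], dim=-1)))
        return self.score(h).squeeze(-1)

dim = 8
gnn = InteractionGNN(dim)

# Toy interaction graph (illustrative): node 0 is the agent, nodes 1-2 are
# tasks, and nodes 3-5 are the actions currently available for those tasks.
x = torch.randn(6, dim)
adj = torch.zeros(6, 6)
for i, j in [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5)]:
    adj[i, j] = adj[j, i] = 1.0

scores = gnn(x, adj)                         # one score per graph node
action_nodes = torch.tensor([3, 4, 5])
best = action_nodes[scores[action_nodes].argmax()]
print(f"chosen action node: {best.item()}")  # greedy pick over available actions
```

Because the network scores each action node individually, the same parameters apply regardless of how many tasks and actions are present, which is the property that matters when the action space evolves under task openness.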
