In this paper we examine the Multi-Agent Meeting Scheduling problem, in which distributed agents negotiate meeting times on behalf of their users. While many negotiation approaches have been proposed for scheduling meetings, it is not well understood how agents can negotiate strategically in order to maximize their users' utility. To negotiate strategically, an agent needs to learn to pick good strategies for negotiating with each other agent. We show how the playbook approach introduced by Bowling, Browning, and Veloso (2004) for team plan selection in small-size robot soccer can be used to select negotiation strategies. Selecting strategies in this way provides theoretical guarantees on regret. We also present experimental results demonstrating the effectiveness of the approach.
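The playbook idea of weighting strategies by observed success, and sampling among them, can be sketched roughly as follows. This is only an illustrative sketch of multiplicative-weights strategy selection, not the authors' implementation; the function names, the learning rate `eta`, and the reward scale in [0, 1] are all assumptions.

```python
import math
import random

def select_strategy(weights):
    """Sample a strategy index in proportion to its current weight."""
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1

def update_weights(weights, chosen, reward, eta=0.1):
    """Multiplicatively reward the chosen strategy (reward assumed in [0, 1]).

    Exponential-weight updates of this form underlie the no-regret
    guarantees that playbook-style selection inherits.
    """
    weights[chosen] *= math.exp(eta * reward)
    return weights
```

Over repeated negotiations, strategies that earn higher rewards against a given agent accumulate larger weights and are chosen more often.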
One of the most important elements of agent performance in multi-agent systems is an agent's ability to predict how other agents will behave. In many domains, several modeling systems are often already available for making behavior predictions, but which one is best for a particular domain and a specific set of agents is often unclear. To find the best available prediction, we would like to know which model would perform best in each possible world state of the domain. However, when resources are limited and each prediction query has a cost, we may need to decide which queries to pursue using only estimates of their benefit and cost: this is metareasoning. To estimate the benefit of a computation, a metareasoner needs a robust measure of performance quality. In this work we present a metareasoning system that relies on a prediction performance measurement, and we propose a novel model performance measure that fulfils this need: Weighted Prediction Divergence.
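The core metareasoning decision described above, choosing which prediction queries to pursue from benefit and cost estimates alone, might be sketched as a greedy selection under a cost budget. This is a hypothetical illustration, not the paper's system; the dictionary fields, the `budget` parameter, and the benefit-per-cost ranking rule are assumptions.

```python
def select_queries(queries, budget):
    """Greedily pick prediction queries by estimated benefit per unit cost,
    stopping when the computational budget is exhausted.

    Each query is a dict with 'name', estimated 'benefit', and 'cost';
    both estimates would come from a performance measure such as the
    one the abstract proposes.
    """
    ranked = sorted(queries, key=lambda q: q["benefit"] / q["cost"], reverse=True)
    chosen, spent = [], 0.0
    for q in ranked:
        if spent + q["cost"] <= budget:
            chosen.append(q["name"])
            spent += q["cost"]
    return chosen
```

The quality of this selection depends entirely on the benefit estimates, which is why a robust performance measure matters.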
Pruning away less important models allows an agent to take its best action in a timely manner, given its knowledge, computational capabilities, and time constraints. We describe a theoretical framework, based on situations, for reasoning about recursive agent models and the strategies and expected strategies associated with them. This framework allows us to rigorously define the gain of continuing deliberation versus taking action. The expected gain of computational actions is then used to guide the pruning of the nested model structure. We have implemented our approach on a canonical multi-agent problem, the pursuit task, to illustrate how real-time, multi-agent decision-making can be based on a principled, combinatorial model. Test results show a marked decrease in deliberation time while maintaining a good level of performance.
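The deliberate-versus-act decision can be illustrated with a minimal sketch: expand the nested model only when the estimated gain in expected utility exceeds the cost of the deliberation itself. The function name, the dictionary keys, and the fixed `deliberation_cost` are illustrative assumptions, not the framework's actual formulation.

```python
def should_deliberate(expected_utilities, deliberation_cost):
    """Decide whether deeper modeling is worth its cost.

    'current' holds the expected utility of each action under the
    present (pruned) model; 'refined' holds the estimated expected
    utilities if one more level of the nested model were expanded.
    """
    act_now = max(expected_utilities["current"])
    after = max(expected_utilities["refined"])
    gain = after - act_now  # expected gain of continuing deliberation
    return gain > deliberation_cost
```

When the gain falls below the cost, the agent stops expanding the recursive model structure and acts, which is what produces the reported savings in deliberation time.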
This continues until B accepts A's offer or gives up. Future work: we intend to perform experiments with dynamic networks. In particular, we are interested in updating the topology of the network based on experience. Based on the result of each iteration of the negotiation, the modeling agent can delete or insert nodes and connections, as well as alter the probabilities in the network. Additionally, new actions may be considered, or old actions eliminated, before embarking on the next round of negotiation.
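One simple way such an experience-driven update might look, sketched here under assumptions of our own (a flat action-to-probability map rather than a full network, a fixed learning rate, and a pruning threshold): after each negotiation round, shift probability toward the actions the other agent was observed to take, decay the rest, and drop actions whose probability has become negligible.

```python
def update_model(network, outcome, lr=0.2, prune_below=0.01):
    """Adjust modeled action probabilities after one negotiation round.

    'network' maps the other agent's actions to probability estimates;
    'outcome["observed"]' is the set of actions seen this round.
    Unused actions decay and are eventually pruned, which mimics
    deleting nodes from the network based on experience.
    """
    for action in list(network):
        if action in outcome["observed"]:
            network[action] += lr * (1.0 - network[action])
        else:
            network[action] *= (1.0 - lr)
        if network[action] < prune_below:
            del network[action]
    return network
```

Inserting a node for a newly observed action would be the symmetric operation: add it with a small initial probability and let subsequent rounds adjust it.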