 Agents


MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces

arXiv.org Machine Learning

Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the health and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. To address this, we propose MARS-Gym, an open-source framework that empowers researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide implementations of a diverse set of baseline agents, with a metrics-driven analysis of them on the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.


Multi-Agent Systems based on Contextual Defeasible Logic considering Focus

arXiv.org Artificial Intelligence

In this paper, we extend previous work on distributed reasoning using Contextual Defeasible Logic (CDL), which enables decentralised distributed reasoning based on a distributed knowledge base, such that the knowledge from different knowledge bases may conflict with each other. However, there are many use case scenarios that cannot be represented in this model. One kind of scenario requires that agents share and reason with relevant knowledge when issuing a query to others. Another kind of scenario is one in which the bindings among the agents (defined by means of mapping rules) are not static, such as in knowledge-intensive and dynamic environments. This work presents a multi-agent model based on CDL that not only allows agents to reason with their local knowledge bases and mapping rules, but also allows agents to reason about relevant knowledge (focus), which is not known by the agents a priori, in the context of a specific query. We present a use case scenario, some formalisations of the proposed model, and an initial implementation based on the BDI (Belief-Desire-Intention) agent model.


Fast Decomposition of Temporal Logic Specifications for Heterogeneous Teams

arXiv.org Artificial Intelligence

In this work, we focus on decomposing large multi-agent path planning problems with global temporal logic goals (common to all agents) into smaller sub-problems that can be solved and executed independently. Crucially, the sub-problems' solutions must jointly satisfy the common global mission specification. The agents' missions are given as Capability Temporal Logic (CaTL) formulas, a fragment of signal temporal logic, that can express properties over tasks involving multiple agent capabilities (sensors, e.g., camera, IR, and effectors, e.g., wheeled, flying, manipulators) under strict timing constraints. The approach we take is to decompose both the temporal logic specification and the team of agents. We jointly reason about the assignment of agents to subteams and the decomposition of formulas using a satisfiability modulo theories (SMT) approach. The output of the SMT solver is then distributed to subteams and leads to a significant speedup in planning time. We include computational results to evaluate the efficiency of our solution, as well as the trade-offs introduced by the conservative nature of the SMT encoding.


Research and Education Towards Smart and Sustainable World

arXiv.org Artificial Intelligence

We propose a vision for directing research and education in the ICT field. Our Smart and Sustainable World vision aims at prosperity for the people and the planet through better awareness and control of both human-made and natural environments. The needs of society, individuals, and industries are fulfilled with intelligent systems that sense their environment, make proactive decisions on actions advancing their goals, and perform the actions on the environment. We emphasize artificial intelligence, feedback loops, human acceptance and control, intelligent use of basic resources, performance parameters, mission-oriented interdisciplinary research, and a holistic systems view complementing the conventional analytical reductive view as a research paradigm, especially for complex problems. To serve a broad audience, we explain these concepts and list the essential literature. We suggest planning research and education by specifying, in a step-wise manner, scenarios, performance criteria, system models, research problems and education content, resulting in common goals and a coherent project portfolio as well as education curricula. Research and education produce feedback to support evolutionary development and encourage creativity in research. Finally, we propose concrete actions for realizing this approach.


Agent Environment Cycle Games

arXiv.org Artificial Intelligence

Partially Observable Stochastic Games (POSGs) are the most general model of games used in Multi-Agent Reinforcement Learning (MARL), modeling actions and observations as happening simultaneously for all agents. We introduce Agent Environment Cycle Games (AEC Games), a model of games based on sequential agent actions and observations. AEC Games can be thought of as sequential versions of POSGs, and we prove that they are equally powerful. We argue conceptually and through case studies that the AEC games model is useful in important scenarios in MARL for which the POSG model is not well suited. We additionally introduce "cyclically expansive curriculum learning," a new MARL curriculum learning method motivated by the AEC games model. It can be applied "for free," and experimentally we show this technique to achieve up to 35.1% more total reward on average.
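As an illustration only, the following minimal sketch contrasts a joint step, in which all agents choose actions from the same observed state, with an AEC-style cycle, in which agents act one at a time and each observes the effect of the previous action. The shared counter state, the policies, and the function names are assumptions made up for this example; they are not taken from the paper.

```python
from typing import Callable, Dict, List

# A policy maps an observation (here, a shared integer counter) to an action.
Policy = Callable[[int], int]


def run_joint_step(policies: Dict[str, Policy], state: int) -> int:
    """Joint step: every agent observes the same state, then all actions apply at once."""
    actions = {name: policy(state) for name, policy in policies.items()}
    return state + sum(actions.values())


def run_sequential_cycle(policies: Dict[str, Policy], order: List[str], state: int) -> int:
    """AEC-style cycle: agents act in turn, each seeing the state left by the previous agent."""
    for name in order:
        state = state + policies[name](state)
    return state


# Hypothetical policies: add 1 if the observed counter is even, otherwise do nothing.
policies: Dict[str, Policy] = {
    "a": lambda obs: 1 if obs % 2 == 0 else 0,
    "b": lambda obs: 1 if obs % 2 == 0 else 0,
}

joint_result = run_joint_step(policies, 0)
sequential_result = run_sequential_cycle(policies, ["a", "b"], 0)
```

In this toy setting the two schemes give different results from the same start state, because agent "b" reacts to the action of agent "a" only in the sequential cycle.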


Interaction-Based Trajectory Prediction Over a Hybrid Traffic Graph

arXiv.org Machine Learning

Behavior prediction of traffic actors is an essential component of any real-world self-driving system. Actors' long-term behaviors tend to be governed by their interactions with other actors or traffic elements (traffic lights, stop signs) in the scene. To capture this highly complex structure of interactions, we propose to use a hybrid graph whose nodes represent both the traffic actors as well as the static and dynamic traffic elements present in the scene. The different modes of temporal interaction (e.g., stopping and going) among actors and traffic elements are explicitly modeled by graph edges. This explicit reasoning about discrete interaction types not only helps in predicting future motion, but also enhances the interpretability of the model, which is important for safety-critical applications such as autonomous driving. We predict actors' trajectories and interaction types using a graph neural network, which is trained in a semi-supervised manner. We show that our proposed model, TrafficGraphNet, achieves state-of-the-art trajectory prediction accuracy while maintaining a high level of interpretability.


A Multi-Agent System for Solving the Dynamic Capacitated Vehicle Routing Problem with Stochastic Customers using Trajectory Data Mining

arXiv.org Artificial Intelligence

The worldwide growth of e-commerce has created new challenges for logistics companies, one of which is delivering products quickly and at low cost. This directly affects how packages are sorted, since steps such as storage and batch creation need to be eliminated. Our work presents a multi-agent system that uses trajectory data mining techniques to extract territorial patterns and use them in the dynamic creation of last-mile routes. The problem can be modeled as a Dynamic Capacitated Vehicle Routing Problem (VRP) with Stochastic Customers and is therefore NP-hard, which makes solving it infeasible for large numbers of packages. The work's main contribution is to solve this problem in a way that depends only on the warehouse system configuration and not on the number of packages processed, which is appropriate for the Big Data scenarios commonly present in the delivery of e-commerce products. Computational experiments were conducted for single- and multi-depot instances. Due to its probabilistic nature, the proposed approach showed slightly lower performance than the static VRP algorithm. However, the operational gains that our solution provides make it very attractive for situations in which the routes must be set dynamically.


When bots do the negotiating, humans more likely to engage in deceptive techniques - Express Computer

#artificialintelligence

Recently, computer scientists at the USC Institute for Creative Technologies (ICT) set out to assess under what conditions humans would employ deceptive negotiating tactics. Through a series of studies, they found that whether humans would embrace a range of deceptive and sneaky techniques depended both on the humans' prior negotiating experience and on whether virtual agents were employed to negotiate on their behalf. The findings stand in contrast to prior studies and show that when humans use intermediaries in the form of virtual agents, they feel more comfortable employing more deceptive techniques than they would normally use when negotiating for themselves. Lead author of the paper on these studies, Johnathan Mell, says, "We want to understand the conditions under which people act deceptively, in some cases purely by giving them an artificial intelligence agent that can do their dirty work for them." Nowadays, virtual agents are employed nearly everywhere, from automated bidders on sites like eBay to virtual assistants on smartphones.


With Whom to Communicate: Learning Efficient Communication for Multi-Robot Collision Avoidance

arXiv.org Artificial Intelligence

Decentralized multi-robot systems typically perform coordinated motion planning by constantly broadcasting their intentions as a means to cope with the lack of a central system coordinating the efforts of all robots. Especially in complex dynamic environments, the coordination boost allowed by communication is critical to avoid collisions between cooperating robots. However, the risk of collision between a pair of robots fluctuates throughout their motion, and communication is not always needed. Additionally, constant communication makes much of the still valuable information shared in previous time steps redundant. This paper presents an efficient communication method that solves the problem of "when" and with "whom" to communicate in multi-robot collision avoidance scenarios. In this approach, every robot learns to reason about other robots' states and considers the risk of future collisions before asking for the trajectory plans of other robots. We evaluate and verify the proposed communication strategy in simulation with four quadrotors and compare it with three baseline strategies: non-communicating, broadcasting, and a distance-based method that broadcasts information to quadrotors within a predefined distance.
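The distance-based baseline mentioned above can be pictured with a short sketch. This is only an assumed reading of that baseline, not the authors' implementation: the positions, the radius value, and the `recipients` helper are hypothetical, and the learned communication policy itself is not shown.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


def recipients(positions: Dict[str, Position], sender: str, radius: float) -> List[str]:
    """Return the robots within `radius` of the sender, i.e. those it would share its plan with."""
    origin = positions[sender]
    return [
        name
        for name, pos in positions.items()
        if name != sender and math.dist(origin, pos) <= radius
    ]


# Hypothetical positions (metres) for four quadrotors flying at the same height.
positions: Dict[str, Position] = {
    "q0": (0.0, 0.0, 1.0),
    "q1": (1.0, 0.0, 1.0),
    "q2": (5.0, 5.0, 1.0),
    "q3": (0.0, 2.0, 1.0),
}

near_q0 = recipients(positions, "q0", radius=2.5)
```

With these made-up positions, `q0` would share its plan with `q1` and `q3` but not with the distant `q2`, which is the kind of saving over full broadcasting that a distance threshold provides.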


Towards a Systematic Computational Framework for Modeling Multi-Agent Decision-Making at Micro Level for Smart Vehicles in a Smart World

arXiv.org Artificial Intelligence

We propose a multi-agent based computational framework for modeling decision-making and strategic interaction at micro level for smart vehicles in a smart world. The concepts of Markov game and best response dynamics are heavily leveraged. Our aim is to make the framework conceptually sound and computationally practical for a range of realistic applications, including micro path planning for autonomous vehicles. To this end, we first convert the would-be stochastic game problem into a closely related deterministic one by introducing a risk premium in the utility function for each individual agent. We show how the sub-game perfect Nash equilibrium of the simplified deterministic game can be solved by an algorithm based on best response dynamics. In order to better model human driving behaviors with bounded rationality, we seek to further simplify the solution concept by replacing the Nash equilibrium condition with a heuristic and adaptive optimization with finite look-ahead anticipation. In addition, the algorithm corresponding to the new solution concept drastically improves the computational efficiency. To demonstrate how our approach can be applied to realistic traffic settings, we conduct a simulation experiment that derives merging and yielding behaviors on a double-lane highway with an unexpected barrier. Despite the different assumptions involved in the two solution concepts, the derived numerical solutions show that the endogenized driving behaviors are very similar. We also briefly comment on how the proposed framework can be further extended in a number of directions in our forthcoming work, such as behavioral calibration using real traffic video data and computational mechanism design for traffic policy optimization.
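To give a feel for best response dynamics, here is a minimal sketch on a one-shot, two-driver merging game. It is a deliberate simplification: the paper works with a multi-stage game and sub-game perfect equilibria, while this example uses a single payoff table whose values, action names, and function names are all hypothetical.

```python
from typing import Dict, Tuple

Action = str
ACTIONS: Tuple[Action, ...] = ("go", "yield")

# Hypothetical payoffs (driver 0, driver 1); both drivers choosing "go" means a collision.
PAYOFFS: Dict[Tuple[Action, Action], Tuple[float, float]] = {
    ("go", "go"): (-10.0, -10.0),
    ("go", "yield"): (2.0, 0.0),
    ("yield", "go"): (0.0, 2.0),
    ("yield", "yield"): (-1.0, -1.0),
}


def best_response(player: int, other_action: Action) -> Action:
    """Pick the action that maximizes this player's payoff against a fixed opponent action."""
    def payoff(action: Action) -> float:
        profile = (action, other_action) if player == 0 else (other_action, action)
        return PAYOFFS[profile][player]
    return max(ACTIONS, key=payoff)


def best_response_dynamics(start: Tuple[Action, Action], max_rounds: int = 20) -> Tuple[Action, Action]:
    """Alternate best responses until the action profile stops changing (or rounds run out)."""
    profile = start
    for _ in range(max_rounds):
        a0 = best_response(0, profile[1])
        a1 = best_response(1, a0)
        if (a0, a1) == profile:
            return profile
        profile = (a0, a1)
    return profile


equilibrium = best_response_dynamics(("yield", "yield"))
```

Starting from both drivers yielding, the iteration settles on a profile in which one driver goes and the other yields; neither can improve by deviating alone, so it is a Nash equilibrium of this toy game. Best response dynamics is not guaranteed to converge in general, which is why the loop is capped by `max_rounds`.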