Agents
Out of the Box: Embodied Navigation in the Real World
Bigazzi, Roberto, Landi, Federico, Cornia, Marcella, Cascianelli, Silvia, Baraldi, Lorenzo, Cucchiara, Rita
The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration thanks to powerful simulating platforms and the availability of 3D data of indoor and photorealistic environments. These two factors have opened the doors to a new generation of intelligent agents capable of achieving nearly perfect PointGoal Navigation. However, such architectures are commonly trained with millions, if not billions, of frames and tested in simulation. Together with great enthusiasm, these results yield a question: how many researchers will effectively benefit from these advances? In this work, we detail how to transfer the knowledge acquired in simulation into the real world. To that end, we describe the architectural discrepancies that damage the Sim2Real adaptation ability of models trained on the Habitat simulator and propose a novel solution tailored towards the deployment in real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot equipped with a single Intel RealSense camera. Different from previous work, our testing scene is unavailable to the agent in simulation. The environment is also inaccessible to the agent beforehand, so it cannot count on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from other fields) needs to employ the agent visual navigation capabilities as-a-Service. Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world. Our code and models are available at https://github.com/aimagelab/LoCoNav.
Probabilistic Loss and its Online Characterization for Simplified Decision Making Under Uncertainty
Zhitnikov, Andrey, Indelman, Vadim
It is a long-standing objective to ease the computation burden incurred by the decision making process. Identification of this mechanism's sensitivity to simplification has tremendous ramifications. Yet, algorithms for decision making under uncertainty usually lean on approximations or heuristics without quantifying their effect. Therefore, challenging scenarios could severely impair the performance of such methods. In this paper, we extend the decision making mechanism to the whole by removing standard approximations and considering all previously suppressed stochastic sources of variability. On top of this extension, our key contribution is a novel framework to simplify decision making while assessing and controlling online the simplification's impact. Furthermore, we present novel stochastic bounds on the return and characterize online the effect of simplification using this framework on a particular simplification technique - reducing the number of samples in belief representation for planning. Finally, we verify the advantages of our approach through extensive simulations.
An Appraisal Transition System for Event-driven Emotions in Agent-based Player Experience Testing
Ansari, Saba Gholizadeh, Prasetya, I. S. W. B., Dastani, Mehdi, Dignum, Frank, Keller, Gabriele
Player experience (PX) evaluation has become a field of interest in the game industry. Several manual PX techniques have been introduced to assist developers to understand and evaluate the experience of players in computer games. However, automated testing of player experience still needs to be addressed. An automated player experience testing framework would allow designers to evaluate the PX requirements in the early development stages without the necessity of participating human players. In this paper, we propose an automated player experience testing approach by suggesting a formal model of event-based emotions. In particular, we discuss an event-based transition system to formalize relevant emotions using Ortony, Clore, & Collins (OCC) theory of emotions. A working prototype of the model is integrated on top of Aplib, a tactical agent programming library, to create intelligent PX test agents, capable of appraising emotions in a 3D game case study. The results are graphically shown e.g. as heat maps. Emotion visualization of the test agent would ultimately help game designers in creating content that evokes a certain experience in players.
Cursed yet Satisfied Agents
Chen, Yiling, Eden, Alon, Wang, Juntao
In real life auctions, a widely observed phenomenon is the winner's curse -- the winner's high bid implies that the winner often over-estimates the value of the good for sale, resulting in an incurred negative utility. The seminal work of Eyster and Rabin [Econometrica'05] introduced a behavioral model aimed to explain this observed anomaly. We term agents who display this bias "cursed agents". We adopt their model in the interdependent value setting, and aim to devise mechanisms that prevent the cursed agents from obtaining negative utility. We design mechanisms that are cursed ex-post IC, that is, incentivize agents to bid their true signal even though they are cursed, while ensuring that the outcome is individually rational -- the price the agents pay is no more than the agents' true value. Since the agents might over-estimate the good's value, such mechanisms might require the seller to make positive transfers to the agents to prevent agents from over-paying. For revenue maximization, we give the optimal deterministic and anonymous mechanism. For welfare maximization, we require ex-post budget balance (EPBB), as positive transfers might lead to negative revenue. We propose a masking operation that takes any deterministic mechanism, and imposes that the seller would not make positive transfers, enforcing EPBB. We show that in typical settings, EPBB implies that the mechanism cannot make any positive transfers, implying that applying the masking operation on the fully efficient mechanism results in a socially optimal EPBB mechanism. This further implies that if the valuation function is the maximum of agents' signals, the optimal EPBB mechanism obtains zero welfare. In contrast, we show that for sum-concave valuations, which include weighted-sum valuations and l_p-norms, the welfare optimal EPBB mechanism obtains half of the optimal welfare as the number of agents grows large.
Intelligent interactive technologies for mental health and well-being
Jovanovic, Mladjan, Jevremovic, Aleksandar, Pejovic-Milovancevic, Milica
The field received significant interest in the previous decade, mainly due to advances in automated machine learning (ML) and deep learning (DL). They learn useful patterns from a large amount of data and keep the acquired knowledge as model structures and parameters that can be further applied to make predictions by interpreting unseen data [1]. These models are either a set of elements or features that contribute when making decisions (in ML) or are organized into several layers of abstraction (such as neural networks) for general and specific interpretation tasks (in DL). Healthcare provision and medicine are one of the most significant challenges of AI due to being the pillars for a global society and the necessity of providing higher-quality assistance to the healthcare workforce [2]. An emerging and expanding domain for for application of AI is mental health. Readily available and ubiquitous devices and applications enable the provision of flexible mental care - on-demand, at any time, both at healthcare facilities and at home.
Visual Perspective Taking for Opponent Behavior Modeling
Chen, Boyuan, Hu, Yuhang, Kwiatkowski, Robert, Song, Shuran, Lipson, Hod
In order to engage in complex social interaction, humans learn at a young age to infer what others see and cannot see from a different point-of-view, and learn to predict others' plans and behaviors. These abilities have been mostly lacking in robots, sometimes making them appear awkward and socially inept. Here we propose an end-to-end long-term visual prediction framework for robots to begin to acquire both these critical cognitive skills, known as Visual Perspective Taking (VPT) and Theory of Behavior (TOB). We demonstrate our approach in the context of visual hide-and-seek - a game that represents a cognitive milestone in human development. Unlike traditional visual predictive model that generates new frames from immediate past frames, our agent can directly predict to multiple future timestamps (25s), extrapolating by 175% beyond the training horizon. We suggest that visual behavior modeling and perspective taking skills will play a critical role in the ability of physical robots to fully integrate into real-world multi-agent activities. Our website is at http://www.cs.columbia.edu/~bchen/vpttob/.
Zero-Shot Generalization using Intrinsically Motivated Compositional Emergent Protocols
Hazra, Rishi, Dixit, Sonu, Sen, Sayambhu
Human language has been described as a system that makes \textit{use of finite means to express an unlimited array of thoughts}. Of particular interest is the aspect of compositionality, whereby, the meaning of a compound language expression can be deduced from the meaning of its constituent parts. If artificial agents can develop compositional communication protocols akin to human language, they can be made to seamlessly generalize to unseen combinations. Studies have recognized the role of curiosity in enabling linguistic development in children. In this paper, we seek to use this intrinsic feedback in inducing a systematic and unambiguous protolanguage. We demonstrate how compositionality can enable agents to not only interact with unseen objects but also transfer skills from one task to another in a zero-shot setting: \textit{Can an agent, trained to `pull' and `push twice', `pull twice'?}.
Forecast Analysis of the COVID-19 Incidence in Lebanon: Prediction of Future Epidemiological Trends to Plan More Effective Control Programs
Ever since the COVID-19 pandemic started, all the governments have been trying to limit its effects on their citizens and countries. This pandemic was harsh on different levels for almost all populations worldwide and this is what drove researchers and scientists to get involved and work on several kinds of simulations to get a better insight into this virus and be able to stop it the earliest possible. In this study, we simulate the spread of COVID-19 in Lebanon using an Agent-Based Model where people are modeled as agents that have specific characteristics and behaviors determined from statistical distributions using Monte Carlo Algorithm. These agents can go into the world, interact with each other, and thus, infect each other. This is how the virus spreads. During the simulation, we can introduce different Non-Pharmaceutical Interventions - or more commonly NPIs - that aim to limit the spread of the virus (wearing a mask, closing locations, etc). Our Simulator was first validated on concepts (e.g. Flattening the Curve and Second Wave scenario), and then it was applied on the case of Lebanon. We studied the effect of opening schools and universities on the pandemic situation in the country since the Lebanese Ministry of Education is planning to do so progressively, starting from 21 April 2021. Based on the results we obtained, we conclude that it would be better to delay the school openings while the vaccination campaign is still slow in the country.
Adaptive Policy Transfer in Reinforcement Learning
Joshi, Girish, Chowdhary, Girish
Efficient and robust policy transfer remains a key challenge for reinforcement learning to become viable for real-wold robotics. Policy transfer through warm initialization, imitation, or interacting over a large set of agents with randomized instances, have been commonly applied to solve a variety of Reinforcement Learning tasks. However, this seems far from how skill transfer happens in the biological world: Humans and animals are able to quickly adapt the learned behaviors between similar tasks and learn new skills when presented with new situations. Here we seek to answer the question: Will learning to combine adaptation and exploration lead to a more efficient transfer of policies between domains? We introduce a principled mechanism that can "Adapt-to-Learn", that is adapt the source policy to learn to solve a target task with significant transition differences and uncertainties. We show that the presented method learns to seamlessly combine learning from adaptation and exploration and leads to a robust policy transfer algorithm with significantly reduced sample complexity in transferring skills between related tasks.
Budget-Constrained Coalition Strategies with Discounting
We assume that the values of propositional variables are not defined Discounting future costs and rewards is a common in the terminal state t. The agent a has multiple actions in practice in accounting, game theory, and machine each game state. These actions are depicted in Figure 1 using learning. In spite of this, existing logics for reasoning directed edges. The cost of each action to agent a is shown as about strategies with cost and resource constraints a label on the directed edge. For instance, the directed edge do not account for discounting. The paper from state w to state u with label 2 means that the agent a proposes a sound and complete logical system for has an action with cost 2 to transition the game from state w reasoning about budget-constrained strategic abilities to state u. Transitioning to the terminal state t represents the that incorporates discounting into its semantics.