Undirected Networks
Discrete Latent Structure in Neural Networks
Niculae, Vlad, Corro, Caio F., Nangia, Nikita, Mihaylova, Tsvetomila, Martins, Andrรฉ F. T.
Many types of data from fields including natural language processing, computer vision, and bioinformatics, are well represented by discrete, compositional structures such as trees, sequences, or matchings. Latent structure models are a powerful tool for learning to extract such representations, offering a way to incorporate structural bias, discover insight about the data, and interpret decisions. However, effective training is challenging, as neural networks are typically designed for continuous computation. This text explores three broad strategies for learning with discrete latent structure: continuous relaxation, surrogate gradients, and probabilistic estimation. Our presentation relies on consistent notations for a wide range of models. As such, we reveal many new connections between latent structure learning strategies, showing how most consist of the same small set of fundamental building blocks, but use them differently, leading to substantially different applicability and properties.
Logic programming for deliberative robotic task planning
Meli, Daniele, Nakawala, Hirenkumar, Fiorini, Paolo
Over the last decade, the use of robots in production and daily life has increased. With increasingly complex tasks and interaction in different environments including humans, robots are required a higher level of autonomy for efficient deliberation. Task planning is a key element of deliberation. It combines elementary operations into a structured plan to satisfy a prescribed goal, given specifications on the robot and the environment. In this manuscript, we present a survey on recent advances in the application of logic programming to the problem of task planning. Logic programming offers several advantages compared to other approaches, including greater expressivity and interpretability which may aid in the development of safe and reliable robots. We analyze different planners and their suitability for specific robotic applications, based on expressivity in domain representation, computational efficiency and software implementation. In this way, we support the robotic designer in choosing the best tool for his application.
Heterogeneous Multi-Robot Reinforcement Learning
Bettini, Matteo, Shankar, Ajay, Prorok, Amanda
Cooperative multi-robot tasks can benefit from heterogeneity in the robots' physical and behavioral traits. In spite of this, traditional Multi-Agent Reinforcement Learning (MARL) frameworks lack the ability to explicitly accommodate policy heterogeneity, and typically constrain agents to share neural network parameters. This enforced homogeneity limits application in cases where the tasks benefit from heterogeneous behaviors. In this paper, we crystallize the role of heterogeneity in MARL policies. Towards this end, we introduce Heterogeneous Graph Neural Network Proximal Policy Optimization (HetGPPO), a paradigm for training heterogeneous MARL policies that leverages a Graph Neural Network for differentiable inter-agent communication. HetGPPO allows communicating agents to learn heterogeneous behaviors while enabling fully decentralized training in partially observable environments. We complement this with a taxonomical overview that exposes more heterogeneity classes than previously identified. To motivate the need for our model, we present a characterization of techniques that homogeneous models can leverage to emulate heterogeneous behavior, and show how this "apparent heterogeneity" is brittle in real-world conditions. Through simulations and real-world experiments, we show that: (i) when homogeneous methods fail due to strong heterogeneous requirements, HetGPPO succeeds, and, (ii) when homogeneous methods are able to learn apparently heterogeneous behaviors, HetGPPO achieves higher resilience to both training and deployment noise.
Sleep Activity Recognition and Characterization from Multi-Source Passively Sensed Data
Martรญnez-Garcรญa, Marรญa, Moreno-Pino, Fernando, Olmos, Pablo M., Artรฉs-Rodrรญguez, Antonio
Sleep constitutes a key indicator of human health, performance, and quality of life. Sleep deprivation has long been related to the onset, development, and worsening of several mental and metabolic disorders, constituting an essential marker for preventing, evaluating, and treating different health conditions. Sleep Activity Recognition methods can provide indicators to assess, monitor, and characterize subjects' sleep-wake cycles and detect behavioral changes. In this work, we propose a general method that continuously operates on passively sensed data from smartphones to characterize sleep and identify significant sleep episodes. Thanks to their ubiquity, these devices constitute an excellent alternative data source to profile subjects' biorhythms in a continuous, objective, and non-invasive manner, in contrast to traditional sleep assessment methods that usually rely on intrusive and subjective procedures. A Heterogeneous Hidden Markov Model is used to model a discrete latent variable process associated with the Sleep Activity Recognition task in a self-supervised way. We validate our results against sleep metrics reported by tested wearables, proving the effectiveness of the proposed approach and advocating its use to assess sleep without more reliable sources.
Continuous Trajectory Generation Based on Two-Stage GAN
Jiang, Wenjun, Zhao, Wayne Xin, Wang, Jingyuan, Jiang, Jiawei
Simulating the human mobility and generating large-scale trajectories are of great use in many real-world applications, such as urban planning, epidemic spreading analysis, and geographic privacy protect. Although many previous works have studied the problem of trajectory generation, the continuity of the generated trajectories has been neglected, which makes these methods useless for practical urban simulation scenarios. To solve this problem, we propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network, namely TS-TrajGen, which efficiently integrates prior domain knowledge of human mobility with model-free learning paradigm. Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior. For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator. Finally, we propose a novel two-stage generation process to overcome the weak point of the existing stochastic generation process. Extensive experiments on two real-world datasets and two case studies demonstrate that our framework yields significant improvements over the state-of-the-art methods.
Risk-Averse Reinforcement Learning via Dynamic Time-Consistent Risk Measures
Traditional reinforcement learning (RL) aims to maximize the expected total reward, while the risk of uncertain outcomes needs to be controlled to ensure reliable performance in a risk-averse setting. In this paper, we consider the problem of maximizing dynamic risk of a sequence of rewards in infinite-horizon Markov Decision Processes (MDPs). We adapt the Expected Conditional Risk Measures (ECRMs) to the infinite-horizon risk-averse MDP and prove its time consistency. Using a convex combination of expectation and conditional value-at-risk (CVaR) as a special one-step conditional risk measure, we reformulate the risk-averse MDP as a risk-neutral counterpart with augmented action space and manipulation on the immediate rewards. We further prove that the related Bellman operator is a contraction mapping, which guarantees the convergence of any value-based RL algorithms. Accordingly, we develop a risk-averse deep Q-learning framework, and our numerical studies based on two simple MDPs show that the risk-averse setting can reduce the variance and enhance robustness of the results.
Semantic and Effective Communication for Remote Control Tasks with Dynamic Feature Compression
Talli, Pietro, Pase, Francesco, Chiariotti, Federico, Zanella, Andrea, Zorzi, Michele
The coordination of robotic swarms and the remote wireless control of industrial systems are among the major use cases for 5G and beyond systems: in these cases, the massive amounts of sensory information that needs to be shared over the wireless medium can overload even high-capacity connections. Consequently, solving the effective communication problem by optimizing the transmission strategy to discard irrelevant information can provide a significant advantage, but is often a very complex task. In this work, we consider a prototypal system in which an observer must communicate its sensory data to an actor controlling a task (e.g., a mobile robot in a factory). We then model it as a remote Partially Observable Markov Decision Process (POMDP), considering the effect of adopting semantic and effective communication-oriented solutions on the overall system performance. We split the communication problem by considering an ensemble Vector Quantized Variational Autoencoder (VQ-VAE) encoding, and train a Deep Reinforcement Learning (DRL) agent to dynamically adapt the quantization level, considering both the current state of the environment and the memory of past messages. We tested the proposed approach on the well-known CartPole reference control problem, obtaining a significant performance increase over traditional approaches
World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges
Taniguchi, Tadahiro, Murata, Shingo, Suzuki, Masahiro, Ognibene, Dimitri, Lanillos, Pablo, Ugur, Emre, Jamone, Lorenzo, Nakamura, Tomoaki, Ciria, Alejandra, Lara, Bruno, Pezzulo, Giovanni
Creating autonomous robots that can actively explore the environment, acquire knowledge and learn skills continuously is the ultimate achievement envisioned in cognitive and developmental robotics. Their learning processes should be based on interactions with their physical and social world in the manner of human learning and cognitive development. Based on this context, in this paper, we focus on the two concepts of world models and predictive coding. Recently, world models have attracted renewed attention as a topic of considerable interest in artificial intelligence. Cognitive systems learn world models to better predict future sensory observations and optimize their policies, i.e., controllers. Alternatively, in neuroscience, predictive coding proposes that the brain continuously predicts its inputs and adapts to model its own dynamics and control behavior in its environment. Both ideas may be considered as underpinning the cognitive development of robots and humans capable of continual or lifelong learning. Although many studies have been conducted on predictive coding in cognitive robotics and neurorobotics, the relationship between world model-based approaches in AI and predictive coding in robotics has rarely been discussed. Therefore, in this paper, we clarify the definitions, relationships, and status of current research on these topics, as well as missing pieces of world models and predictive coding in conjunction with crucially related concepts such as the free-energy principle and active inference in the context of cognitive and developmental robotics. Furthermore, we outline the frontiers and challenges involved in world models and predictive coding toward the further integration of AI and robotics, as well as the creation of robots with real cognitive and developmental capabilities in the future.
Fairness and Sequential Decision Making: Limits, Lessons, and Opportunities
Nashed, Samer B., Svegliato, Justin, Blodgett, Su Lin
As automated decision making and decision assistance systems become common in everyday life, research on the prevention or mitigation of potential harms that arise from decisions made by these systems has proliferated. However, various research communities have independently conceptualized these harms, envisioned potential applications, and proposed interventions. The result is a somewhat fractured landscape of literature focused generally on ensuring decision-making algorithms "do the right thing". In this paper, we compare and discuss work across two major subsets of this literature: algorithmic fairness, which focuses primarily on predictive systems, and ethical decision making, which focuses primarily on sequential decision making and planning. We explore how each of these settings has articulated its normative concerns, the viability of different techniques for these different settings, and how ideas from each setting may have utility for the other.
A survey and taxonomy of loss functions in machine learning
Ciampiconi, Lorenzo, Elwood, Adam, Leonardi, Marco, Mohamed, Ashraf, Rozza, Alessandro
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.