Demiris, Yiannis
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation
Tian, Hanlin, Reddy, Kethan, Feng, Yuxiang, Quddus, Mohammed, Demiris, Yiannis, Angeloudis, Panagiotis
This paper introduces CRITICAL, a novel closed-loop framework for autonomous vehicle (AV) training and testing. CRITICAL stands out for its ability to generate diverse scenarios, focusing on critical driving situations that target specific learning and performance gaps identified in the Reinforcement Learning (RL) agent. The framework achieves this by integrating real-world traffic dynamics, driving behavior analysis, surrogate safety measures, and an optional Large Language Model (LLM) component. We show that establishing a closed feedback loop between the data generation pipeline and the training process can increase the learning rate during training, improve overall system performance, and strengthen safety resilience. Our evaluations, conducted using the Proximal Policy Optimization (PPO) algorithm and the HighwayEnv simulation environment, demonstrate noticeable performance improvements when critical case generation and LLM analysis are integrated, indicating CRITICAL's potential to improve the robustness of AV systems and streamline the generation of critical scenarios. This ultimately serves to accelerate the development of AV agents, broaden the scope of RL training, and strengthen validation efforts for AV safety.
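As a rough illustration of the closed loop described above (train an RL agent, detect where it fails, and regenerate harder scenarios around those failures), the sketch below wires PPO from stable-baselines3 to the highway-env package. It is an assumption-laden sketch rather than the CRITICAL implementation: the environment id, config keys, and the "crashed" info field come from highway-env's documented interface, and the difficulty-update rule is invented for illustration.

```python
# Illustrative closed train/evaluate/regenerate loop (a sketch, not the CRITICAL codebase).
# Assumes gymnasium, highway-env and stable-baselines3 are installed; the config keys
# and the "crashed" info field are assumptions about the highway-env package.
import gymnasium as gym
import highway_env  # noqa: F401  (importing registers the highway-v0 environment)
from stable_baselines3 import PPO

env = gym.make("highway-v0")
model = PPO("MlpPolicy", env, verbose=0)
difficulty = {"vehicles_count": 20, "vehicles_density": 1.0}

for round_idx in range(5):
    model.learn(total_timesteps=10_000)

    # Evaluate the current policy and count safety-critical outcomes (collisions).
    crashes = 0
    for _ in range(10):
        obs, _ = env.reset()
        done = truncated = False
        while not (done or truncated):
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, done, truncated, info = env.step(action)
        crashes += int(info.get("crashed", False))

    # Feedback step: if the agent copes with the current traffic, generate denser,
    # more critical scenarios for the next round (takes effect on the next reset).
    if crashes <= 2:
        difficulty["vehicles_count"] += 5
        difficulty["vehicles_density"] += 0.25
        env.unwrapped.configure(difficulty)

    print(f"round {round_idx}: crashes={crashes}, difficulty={difficulty}")
```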
On Specifying for Trustworthiness
Abeywickrama, Dhaminda B., Bennaceur, Amel, Chance, Greg, Demiris, Yiannis, Kordoni, Anastasia, Levine, Mark, Moffat, Luke, Moreau, Luc, Mousavi, Mohammad Reza, Nuseibeh, Bashar, Ramamoorthy, Subramanian, Ringert, Jan Oliver, Wilson, James, Windsor, Shane, Eder, Kerstin
As autonomous systems (AS) increasingly become part of our daily lives, ensuring their trustworthiness is crucial. In order to demonstrate the trustworthiness of an AS, we first need to specify what is required for an AS to be considered trustworthy. This roadmap paper identifies key challenges for specifying for trustworthiness in AS, drawing on the "Specifying for Trustworthiness" workshop held as part of the UK Research and Innovation (UKRI) Trustworthy Autonomous Systems (TAS) programme. We look across a range of AS domains, considering the resilience, trust, functionality, verifiability, security, and governance and regulation of AS, and identify some of the key specification challenges in these domains. We then highlight the intellectual challenges involved in specifying for trustworthiness in AS that cut across domains and are exacerbated by the inherent uncertainty of the environments in which AS must operate.
Self-Supervised RGB-T Tracking with Cross-Input Consistency
Zhang, Xingchen, Demiris, Yiannis
In this paper, we propose a self-supervised RGB-T tracking method. Different from existing deep RGB-T trackers that use a large number of annotated RGB-T image pairs for training, our RGB-T tracker is trained using unlabeled RGB-T video pairs in a self-supervised manner. We propose a novel self-supervised training strategy based on cross-input consistency, exploiting the idea that tracking can be performed using different inputs. Specifically, we construct two distinct inputs using unlabeled RGB-T video pairs. We then track objects using each of these inputs and construct our cross-input consistency loss from the resulting tracking outputs. Meanwhile, we propose a reweighting strategy to make our loss function robust to low-quality training samples. We build our tracker on a Siamese correlation filter network. To the best of our knowledge, our tracker is the first self-supervised RGB-T tracker. Extensive experiments on two public RGB-T tracking benchmarks demonstrate that the proposed training strategy is effective. Remarkably, despite being trained only on a corpus of unlabeled RGB-T video pairs, our tracker outperforms seven supervised RGB-T trackers on the GTOT dataset.
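A hedged sketch of the core idea, tracking the same target from two different inputs and penalising disagreement between the resulting response maps while down-weighting low-quality samples, is given below. The tensor shapes, the peak-based quality proxy, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal PyTorch sketch of a cross-input consistency loss with sample reweighting.
# Response maps r1, r2 come from tracking the same target with two different inputs
# (e.g. constructed from RGB and thermal frames); shapes and weighting are assumptions.
import torch

def response_quality(r: torch.Tensor) -> torch.Tensor:
    """Crude peak-to-mean quality score per sample, used to down-weight noisy maps."""
    flat = r.view(r.shape[0], -1)
    peak = flat.max(dim=1).values
    return (peak - flat.mean(dim=1)) / (flat.std(dim=1) + 1e-6)

def cross_input_consistency_loss(r1: torch.Tensor, r2: torch.Tensor) -> torch.Tensor:
    """Weighted MSE between two response maps of shape (B, H, W)."""
    w = torch.minimum(response_quality(r1), response_quality(r2)).detach()
    w = torch.clamp(w, min=0.0)
    w = w / (w.sum() + 1e-6)                      # normalise weights over the batch
    per_sample = ((r1 - r2) ** 2).mean(dim=(1, 2))
    return (w * per_sample).sum()

# Example usage with random stand-in response maps:
r1 = torch.rand(4, 17, 17, requires_grad=True)
r2 = torch.rand(4, 17, 17)
loss = cross_input_consistency_loss(r1, r2)
loss.backward()
```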
Disentangled Sequence Clustering for Human Intention Inference
Zolotas, Mark, Demiris, Yiannis
Equipping robots with the ability to infer human intent is a vital precondition for effective collaboration. Most computational approaches towards this objective derive a probability distribution of "intent" conditioned on the robot's perceived state. However, these approaches typically assume task-specific labels of human intent are known a priori. To overcome this constraint, we propose the Disentangled Sequence Clustering Variational Autoencoder (DiSCVAE), a clustering framework capable of learning such a distribution of intent in an unsupervised manner. The proposed framework leverages recent advances in unsupervised learning to disentangle latent representations of sequence data, separating time-varying local features from time-invariant global attributes. As a novel extension, the DiSCVAE also infers a discrete variable to form a latent mixture model and thus enable clustering over these global sequence concepts, e.g. high-level intentions. We evaluate the DiSCVAE on a real-world human-robot interaction dataset collected using a robotic wheelchair. Our findings reveal that the inferred discrete variable coincides with human intent, holding promise for collaborative settings, such as shared control.
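The disentanglement described above can be sketched, under heavy simplification, as a sequence VAE with a discrete global latent (a Gumbel-Softmax cluster code shared across the sequence) and continuous per-step local latents. The snippet below is a minimal illustration under assumed architecture sizes and priors, not the DiSCVAE itself.

```python
# Simplified PyTorch sketch of a sequence VAE with a discrete global latent (cluster)
# and per-step local latents, in the spirit of the disentanglement described above.
# Architecture sizes, priors, and the Gumbel-Softmax relaxation are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqClusterVAE(nn.Module):
    def __init__(self, x_dim=8, h_dim=64, z_dim=16, n_clusters=5):
        super().__init__()
        self.encoder_rnn = nn.GRU(x_dim, h_dim, batch_first=True)
        self.cluster_logits = nn.Linear(h_dim, n_clusters)      # global, time-invariant
        self.local_mu = nn.Linear(h_dim, z_dim)                  # local, time-varying
        self.local_logvar = nn.Linear(h_dim, z_dim)
        self.decoder = nn.Sequential(
            nn.Linear(z_dim + n_clusters, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

    def forward(self, x, tau=0.5):
        h_seq, h_last = self.encoder_rnn(x)                      # (B, T, H), (1, B, H)
        # Global discrete variable from the sequence summary (one sample per sequence).
        logits = self.cluster_logits(h_last.squeeze(0))          # (B, K)
        y = F.gumbel_softmax(logits, tau=tau, hard=False)        # relaxed one-hot code
        # Local Gaussian latents from every timestep.
        mu, logvar = self.local_mu(h_seq), self.local_logvar(h_seq)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Decode each step conditioned on its local latent and the shared cluster code.
        y_rep = y.unsqueeze(1).expand(-1, x.shape[1], -1)
        x_hat = self.decoder(torch.cat([z, y_rep], dim=-1))
        # ELBO terms: reconstruction + KL to a standard normal and a uniform categorical.
        recon = F.mse_loss(x_hat, x)
        kl_z = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        q_y = F.softmax(logits, dim=-1)
        kl_y = torch.mean((q_y * (q_y.clamp_min(1e-8).log()
                                  + torch.log(torch.tensor(float(q_y.shape[-1]))))).sum(-1))
        return recon + kl_z + kl_y, q_y.argmax(-1)               # loss and cluster guess

model = SeqClusterVAE()
loss, clusters = model(torch.randn(4, 30, 8))                    # batch of 4 sequences
loss.backward()
```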
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation
Wang, Ruohan, Ciliberto, Carlo, Amadori, Pierluigi, Demiris, Yiannis
We consider a specific setting of imitation learning, the task of policy learning from expert demonstrations, in which the learner only has a finite number of expert trajectories without any further access to the expert. Two broad categories of approaches to this setting are behavioral cloning (BC; Pomerleau, 1991), which directly learns a policy mapping from states to actions with supervised learning on expert trajectories, and inverse reinforcement learning (IRL; Ng & Russell, 2000; Abbeel & Ng, 2004), which learns a policy via reinforcement learning using a cost function extracted from expert trajectories. Most notably, BC has been successfully applied to the task of autonomous driving (Bojarski et al., 2016; Bansal et al., 2018). Despite its simplicity, BC typically requires a large amount of training data to learn good policies, as it may suffer from compounding errors caused by covariate shift (Ross & Bagnell, 2010; Ross et al., 2011). BC is often used as a policy initialization step for further reinforcement learning (Nagabandi et al., 2018; Rajeswaran et al., 2017). IRL estimates a cost function from expert trajectories and uses reinforcement learning to derive policies. As the cost function evaluates the quality of trajectories rather than that of individual actions, IRL avoids the problem of compounding errors. IRL is effective on a wide range of problems, from continuous control benchmarks in the MuJoCo environment (Ho & Ermon, 2016) to robot footstep planning (Ziebart et al., 2008). Generative Adversarial Imitation Learning (GAIL; Ho & Ermon, 2016; Baram et al., 2017) connects IRL to the general framework of Generative Adversarial Networks (GANs; Goodfellow et al., 2014).
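For reference, the behavioral cloning baseline discussed above amounts to supervised regression from states to expert actions; a minimal sketch follows. The dimensions, network, and continuous-action MSE objective are assumptions for illustration, and this is not the paper's proposed support-estimation method.

```python
# Minimal behavioral cloning sketch: supervised learning of a state-to-action mapping
# from expert trajectories (illustrative; not the Random Expert Distillation method).
import torch
import torch.nn as nn

state_dim, action_dim = 11, 3                      # assumed dimensions
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                       nn.Linear(64, 64), nn.Tanh(),
                       nn.Linear(64, action_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

# Stand-in expert dataset: in practice these come from recorded demonstrations.
expert_states = torch.randn(1024, state_dim)
expert_actions = torch.randn(1024, action_dim)

for epoch in range(50):
    idx = torch.randint(0, expert_states.shape[0], (128,))    # minibatch of transitions
    pred = policy(expert_states[idx])
    loss = nn.functional.mse_loss(pred, expert_actions[idx])  # clone the expert's actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```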
Real-Time Workload Classification during Driving using HyperNetworks
Wang, Ruohan, Amadori, Pierluigi V., Demiris, Yiannis
Classifying human cognitive states from behavioral and physiological signals is a challenging problem with important applications in robotics. The difficulty stems from data variability among individual users and from sensor artefacts. In this work, we propose an end-to-end framework for real-time cognitive workload classification with mixture Hyper Long Short Term Memory Networks, a novel variant of HyperNetworks. Evaluating the proposed approach on an eye-gaze pattern dataset collected from simulated driving scenarios of different cognitive demands, we show that the proposed framework outperforms previous baseline methods and achieves 83.9% precision and 87.8% recall on the test set. We also demonstrate the merit of our proposed architecture by showing improved performance over other LSTM-based methods.
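A loose sketch of the hypernetwork idea, a small auxiliary network that generates the parameters of the classifier from a summary of each gaze window, is shown below. The layer sizes and the choice to generate only the output layer are simplifying assumptions; this is not the mixture Hyper-LSTM architecture of the paper.

```python
# Sketch of a HyperNetwork-flavoured classifier: a small auxiliary network generates
# the weights of the output layer from a summary of the input window, loosely inspired
# by the mixture Hyper-LSTM idea above. Sizes and structure are illustrative assumptions.
import torch
import torch.nn as nn

class HyperWorkloadClassifier(nn.Module):
    def __init__(self, feat_dim=4, hidden=32, n_classes=2):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Hypernetwork: maps the sequence summary to the parameters of the output layer.
        self.hyper = nn.Linear(hidden, hidden * n_classes + n_classes)
        self.hidden, self.n_classes = hidden, n_classes

    def forward(self, x):                       # x: (B, T, feat_dim) gaze features
        h_seq, (h_last, _) = self.encoder(x)
        summary = h_last.squeeze(0)             # (B, hidden) sequence summary
        params = self.hyper(summary)            # per-sample generated parameters
        W = params[:, : self.hidden * self.n_classes].view(-1, self.n_classes, self.hidden)
        b = params[:, self.hidden * self.n_classes :]
        # Apply the generated classifier to the same summary representation.
        logits = torch.bmm(W, summary.unsqueeze(-1)).squeeze(-1) + b
        return logits

model = HyperWorkloadClassifier()
logits = model(torch.randn(8, 100, 4))          # 8 windows of 100 gaze samples
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (8,)))
loss.backward()
```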
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self
Moulin-Frier, Clément, Fischer, Tobias, Petit, Maxime, Pointeau, Grégoire, Puigbo, Jordi-Ysard, Pattacini, Ugo, Low, Sock Ching, Camilleri, Daniel, Nguyen, Phuong, Hoffmann, Matej, Chang, Hyung Jin, Zambelli, Martina, Mealier, Anne-Laure, Damianou, Andreas, Metta, Giorgio, Prescott, Tony J., Demiris, Yiannis, Dominey, Peter Ford, Verschure, Paul F. M. J.
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
MAGAN: Margin Adaptation for Generative Adversarial Networks
Wang, Ruohan, Cully, Antoine, Chang, Hyung Jin, Demiris, Yiannis
We propose the Margin Adaptation for Generative Adversarial Networks (MAGAN) algorithm, a novel training procedure for GANs to improve stability and performance by using an adaptive hinge loss function. We estimate the appropriate hinge loss margin with the expected energy of the target distribution, and derive principled criteria for when to update the margin. We prove that our method converges to its global optimum under certain assumptions. Evaluated on the task of unsupervised image generation, the proposed training procedure is simple yet robust on a diverse set of data, and achieves qualitative and quantitative improvements compared to the state-of-the-art.
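The adaptive-margin hinge loss can be sketched as follows: with an energy-based discriminator (e.g. the reconstruction error of an autoencoder), the discriminator minimises the energy of real samples plus a hinge max(0, m - energy) on generated samples, and the margin m tracks the expected energy of the real data. The running-average update below is a stand-in for the paper's principled update criteria.

```python
# Sketch of a margin-adaptive hinge loss for an energy-based GAN discriminator.
# energy_real / energy_fake are assumed to be per-sample reconstruction errors of an
# autoencoder discriminator; the running-average margin update is a simplified stand-in.
import torch

def discriminator_loss(energy_real: torch.Tensor,
                       energy_fake: torch.Tensor,
                       margin: float) -> torch.Tensor:
    """Hinge loss: push real energy down, keep fake energy above the margin."""
    return energy_real.mean() + torch.clamp(margin - energy_fake, min=0.0).mean()

def update_margin(margin: float, energy_real: torch.Tensor, momentum: float = 0.9) -> float:
    """Track the expected energy of real samples as the new margin (simplified)."""
    return momentum * margin + (1.0 - momentum) * energy_real.mean().item()

# Example with stand-in energies for a batch of real and generated samples:
margin = 1.0
energy_real = torch.rand(16)            # would be D's reconstruction error on real data
energy_fake = torch.rand(16)            # ... and on generator samples
d_loss = discriminator_loss(energy_real, energy_fake, margin)
margin = update_margin(margin, energy_real)
```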
The Kernel Pitman-Yor Process
Chatzis, Sotirios P., Korkinof, Dimitrios, Demiris, Yiannis
In this work, we propose the kernel Pitman-Yor process (KPYP) for nonparametric clustering of data with general spatial or temporal interdependencies. The KPYP is constructed by first introducing an infinite sequence of random locations. Then, based on the stick-breaking construction of the Pitman-Yor process, we define a predictor-dependent random probability measure by considering that the discount hyperparameters of the Beta-distributed random weights (stick variables) of the process are not uniform among the weights, but controlled by a kernel function expressing the proximity between the location assigned to each weight and the given predictors.
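A hedged rendering of this construction in standard stick-breaking notation is given below; the exact parameterisation of the kernel-dependent discounts is defined in the paper, so the following should be read as illustrative.

```latex
% Stick-breaking view of a predictor-dependent Pitman-Yor construction (illustrative).
% \omega_k are the random locations, x the predictors, K(\cdot,\cdot) a kernel, and the
% kernel modulates the discount of each Beta-distributed stick variable.
\begin{align}
  d_k(x) &= d \, K(x, \omega_k), \\
  v_k(x) &\sim \mathrm{Beta}\bigl(1 - d_k(x),\; \alpha + k\, d_k(x)\bigr), \\
  \pi_k(x) &= v_k(x) \prod_{j<k} \bigl(1 - v_j(x)\bigr), \qquad
  G_x = \sum_{k=1}^{\infty} \pi_k(x)\, \delta_{\theta_k}.
\end{align}
```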
The Third International Conference on Human-Robot Interaction
Fong, Terry (NASA Ames Research Center) | Dautenhahn, Kerstin (University of Hertfordshire) | Scheutz, Matthias (Indiana University) | Demiris, Yiannis (Imperial College)
The third international conference on Human-Robot Interaction (HRI-2008) was held in Amsterdam, The Netherlands, March 12-15, 2008. The theme of HRI-2008, "Living With Robots", highlights the importance of the technical and social issues underlying human-robot interaction with companion and assistive robots for long-term use in everyday life and work activities. More than two hundred and fifty researchers, practitioners, and exhibitors attended the conference, and many more contributed to the conference as authors or reviewers. HRI-2009 will be held in San Diego, California, from March 11-13, 2009.