Farsang, Mónika
Depth Matters: Multimodal RGB-D Perception for Robust Autonomous Agents
Clement, Mihaela-Larisa, Farsang, Mónika, Resch, Felix, Grosu, Radu
Autonomous agents that rely purely on perception to make real-time control decisions require efficient and robust architectures. In this work, we demonstrate that augmenting RGB input with depth information significantly enhances our agents' ability to predict steering commands compared to using RGB alone. We benchmark lightweight recurrent controllers that leverage the fused RGB-D features for sequential decision-making. To train our models, we collect high-quality data using a small-scale autonomous car controlled by an expert driver via a physical steering wheel, capturing varying levels of steering difficulty. Our models, trained under diverse configurations, have been successfully deployed on real hardware. In particular, our findings reveal that early fusion of depth data yields a highly robust controller, which remains effective even under frame drops and increased noise levels, without compromising the network's focus on the task.
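As an illustration of the early-fusion idea described above, here is a minimal PyTorch-style sketch: depth is concatenated with RGB as a fourth input channel before any convolution, and a recurrent cell integrates the fused features over time to predict a steering command. Layer sizes, names, and the choice of a GRU cell are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class EarlyFusionSteeringNet(nn.Module):
    """Minimal RGB-D early-fusion controller sketch (illustrative only).

    Depth is fused with RGB at the input ("early fusion"); a recurrent
    cell then integrates the fused features over time to predict steering.
    """
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=5, stride=2), nn.ReLU(),  # 4 = RGB + depth
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRUCell(32, hidden_size)
        self.head = nn.Linear(hidden_size, 1)  # scalar steering command

    def forward(self, rgb, depth, h):
        x = torch.cat([rgb, depth], dim=1)  # early fusion at the input
        h = self.rnn(self.encoder(x), h)
        return self.head(h), h

# usage on a dummy frame sequence
net = EarlyFusionSteeringNet()
h = torch.zeros(1, 64)
for _ in range(5):
    rgb, depth = torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64)
    steer, h = net(rgb, depth, h)
```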
MMDVS-LF: A Multi-Modal Dynamic-Vision-Sensor Line Following Dataset
Resch, Felix, Farsang, Mónika, Grosu, Radu
Dynamic Vision Sensors (DVS) offer a unique advantage in control applications due to their high temporal resolution and asynchronous, event-based data. Still, their adoption in machine learning algorithms remains limited. To address this gap and promote the development of models that leverage the specific characteristics of DVS data, we introduce the Multi-Modal Dynamic-Vision-Sensor Line Following dataset (MMDVS-LF). This comprehensive dataset is the first to integrate multiple sensor modalities, including DVS recordings, RGB video, odometry, and Inertial Measurement Unit (IMU) data, from a small-scale standardized vehicle. Additionally, the dataset includes eye-tracking and demographic data of drivers performing a line-following task on a track. With its diverse range of data, MMDVS-LF opens new opportunities for developing deep learning algorithms and conducting data science projects across various domains, supporting innovation in autonomous systems and control applications.
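To make the multi-modal structure concrete, below is a hypothetical sketch of a single synchronized sample, together with one common way to adapt raw DVS events for frame-based networks. All field names and shapes are illustrative assumptions, not the actual MMDVS-LF schema.

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class MMDVSSample:
    """Hypothetical synchronized time step; field names and shapes
    are illustrative, not the actual MMDVS-LF schema."""
    events: np.ndarray          # DVS events as (t, x, y, polarity) rows
    rgb: np.ndarray             # (H, W, 3) video frame
    odometry: np.ndarray        # e.g., pose and wheel speeds
    imu: np.ndarray             # accelerometer and gyroscope readings
    gaze: Optional[np.ndarray]  # driver eye-tracking point, if recorded

def events_to_frame(events: np.ndarray, h: int, w: int) -> np.ndarray:
    """Accumulate polarity-split event counts into a 2-channel frame,
    one common way to feed asynchronous DVS data to conventional CNNs."""
    frame = np.zeros((2, h, w), dtype=np.float32)
    for t, x, y, p in events:
        frame[int(p), int(y), int(x)] += 1.0
    return frame
```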
Learning with Chemical versus Electrical Synapses -- Does it Make a Difference?
Farsang, Mónika, Lechner, Mathias, Lung, David, Hasani, Ramin, Rus, Daniela, Grosu, Radu
Bio-inspired neural networks have the potential to advance our understanding of neural computation and improve the state of the art in AI systems. Bio-electrical synapses directly transmit neural signals by enabling fast current flow between neurons. In contrast, bio-chemical synapses transmit neural signals indirectly, through neurotransmitters. Prior work showed that interpretable dynamics for complex robotic control can be achieved with chemical synapses within a sparse, bio-inspired architecture called Neural Circuit Policies (NCPs). However, a comparison of these two synaptic models within the same architecture has remained unexplored. In this work, we aim to determine the impact of using chemical synapses compared to electrical synapses, in both sparse and all-to-all connected networks. We conduct experiments on autonomous lane-keeping in a photorealistic driving simulator to evaluate their performance under diverse conditions and in the presence of noise. The experiments highlight the substantial influence of both the architectural and the synaptic-model choice. Our results show that employing chemical synapses yields noticeable improvements over electrical synapses, and that NCPs lead to better results with both synaptic models.
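For intuition, here is a minimal sketch of the two synapse models as they commonly appear in this line of work (the sigmoidal gating form follows the LTC/NCP literature; parameter names are illustrative, and the paper's exact formulation may differ). An electrical synapse conducts current in proportion to the voltage difference between neurons, while a chemical synapse gates a conductance through a sigmoid of the presynaptic potential toward a reversal potential.

```python
import numpy as np

def electrical_current(v_pre, v_post, g):
    """Gap-junction (electrical) synapse: direct ohmic coupling, so
    current flows with the voltage difference between the two neurons."""
    return g * (v_pre - v_post)

def chemical_current(v_pre, v_post, g, mu, sigma, e_rev):
    """Chemical synapse: a sigmoidal gate on the presynaptic potential
    scales a conductance that pulls the postsynaptic neuron toward the
    reversal potential e_rev, as in LTC/NCP-style models."""
    gate = 1.0 / (1.0 + np.exp(-sigma * (v_pre - mu)))
    return g * gate * (e_rev - v_post)
```

The chemical model is nonlinear in the presynaptic state, which is one source of the richer dynamics attributed to NCPs in prior work.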
Conditionally Risk-Averse Contextual Bandits
Farsang, Mónika, Mineiro, Paul, Zhang, Wangda
Contextual bandits [Auer et al., 2002, Langford and Zhang, 2007] are a mature technology with numerous applications; however, adoption has been most aggressive in recommendation scenarios [Bouneffouf and Rish, 2019], where the worst-case outcome is user annoyance. At the other extreme are medical and defense scenarios, where worst-case outcomes are literally fatal. In between lie scenarios of interest where bad outcomes are tolerable but should be avoided, e.g., logistics, finance, and self-tuning software, where the term "tail catastrophe" highlights the inadequacy of average-case performance guarantees in real-world applications [Marcus et al., 2021]. These scenarios demand risk-aversion, i.e., decisions should sacrifice average performance in order to avoid worst-case outcomes, and incorporating risk-aversion into contextual bandits would facilitate adoption. More generally, risk-aversion is essential for making informed decisions that align with the risk preferences of the decision maker, balancing the potential benefits and risks of a particular action.
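One standard way to operationalize this notion of risk-aversion is the conditional value-at-risk (CVaR): the expected loss over the worst α-fraction of outcomes. The sketch below illustrates the general idea only and is not the paper's specific estimator.

```python
import numpy as np

def cvar(losses, alpha=0.05):
    """Conditional value-at-risk: mean loss over the worst alpha fraction
    of outcomes. Optimizing CVaR targets tail catastrophes rather than
    average-case performance."""
    losses = np.sort(np.asarray(losses))[::-1]  # worst losses first
    k = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:k].mean()

# A policy with a good average loss can still have a catastrophic tail:
losses = np.concatenate([np.full(95, 0.1), np.full(5, 10.0)])
print(losses.mean())   # ~0.6 on average
print(cvar(losses))    # 10.0 over the worst 5% of outcomes
```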
Importance of Environment Design in Reinforcement Learning: A Study of a Robotic Environment
Farsang, Mónika, Szegletes, Luca
An in-depth understanding of the particular environment is crucial in reinforcement learning (RL). To address this challenge, this paper studies the decision-making process of a mobile collaborative robotic assistant modeled within the Markov decision process (MDP) framework. The optimal state-action combinations of the MDP are calculated with the non-linear Bellman optimality equations. This system of equations can be solved with relative ease using the computational power of Wolfram Mathematica, where the obtained optimal action-values point to the optimal policy. Unlike other RL algorithms, this methodology does not approximate the optimal behavior; it yields the exact, explicit solution, which provides a strong foundation for our study. With this, we offer new insights into the action-selection mechanisms in RL. During the analysis of the robotic environment, we present various small modifications to the very same schema that lead to different optimal policies. Finally, we emphasize that beyond building efficient RL algorithms, only the proper design of the environment can ensure the desired results.
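For reference, the Bellman optimality equations solved here take the standard form (standard MDP notation, with discount factor γ):

```latex
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[\, R(s, a, s') + \gamma\, V^{*}(s') \,\bigr]

Q^{*}(s, a) = \sum_{s'} P(s' \mid s, a)\,\bigl[\, R(s, a, s') + \gamma \max_{a'} Q^{*}(s', a') \,\bigr]
```

For a finite MDP this is a system of equations in the unknown values, non-linear because of the max operator, which symbolic solvers such as Mathematica's can handle exactly for small state-action spaces.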
Decaying Clipping Range in Proximal Policy Optimization
Farsang, Mónika, Szegletes, Luca
Proximal Policy Optimization (PPO) is among the most widely used algorithms in reinforcement learning, achieving state-of-the-art performance in many challenging problems. The keys to its success are the reliable policy updates through the clipping mechanism and the multiple epochs of minibatch updates. The aim of this research is to provide simple yet effective alternatives to the former. To this end, we propose linearly and exponentially decaying clipping-range schedules over the course of training. These allow greater exploration at the beginning of the learning phase and impose stronger restrictions toward its end. We investigate their performance in several classical control and robotic locomotion environments. Our analysis shows that they influence the achieved rewards and are effective alternatives to the constant clipping method in many reinforcement learning tasks.
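A minimal sketch of the two proposed schedules plugged into the standard PPO clipped surrogate (the decay constants below are illustrative; the paper evaluates several configurations):

```python
import numpy as np

def linear_clip(eps0, step, total_steps, eps_min=0.0):
    """Clipping range decays linearly from eps0 to eps_min over training."""
    frac = min(step / total_steps, 1.0)
    return eps0 + (eps_min - eps0) * frac

def exp_clip(eps0, step, decay=0.9999):
    """Clipping range decays exponentially; the decay constant is illustrative."""
    return eps0 * decay ** step

def ppo_clip_objective(ratio, advantage, eps):
    """Standard PPO clipped surrogate: a smaller eps restricts how far
    the new policy may move from the old one in a single update."""
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)
```

Early in training eps is large, permitting exploratory policy updates; as it decays, updates become increasingly conservative.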