Petiushko, Aleksandr
SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models
Zhang, Jiawei, Yang, Xuan, Wang, Taiqi, Yao, Yu, Petiushko, Aleksandr, Li, Bo
Traditional autonomous driving systems often struggle to integrate high-level reasoning with low-level control, resulting in suboptimal and sometimes unsafe driving behaviors. The emergence of Multimodal Large Language Models (MLLMs), which can process both visual and textual data, presents an opportunity to unify perception and reasoning tasks within a single framework. However, effectively embedding precise safety knowledge into MLLMs for autonomous driving remains a significant challenge. To address this, we propose SafeAuto, a novel framework that enhances MLLM-based autonomous driving systems by incorporating both unstructured and structured knowledge. Specifically, we first introduce the Position-Dependent Cross-Entropy (PDCE) loss function, designed to improve the accuracy of low-level control signal predictions when numerical values are represented as text. Second, to ensure safe autonomous driving by explicitly integrating precise safety knowledge into the MLLM, we develop a reasoning component for SafeAuto. This component translates driving safety regulations into first-order logic rules (e.g., "red light => stop") and incorporates these rules into a probabilistic graphical model, such as a Markov Logic Network (MLN). The MLN is trained to verify the predicted next actions using environmental attributes identified by attribute recognition models (e.g., detecting a red light) to form the predicates. Additionally, we construct a Multimodal RAG model that leverages video data, control signals, and environmental attributes to learn more effectively from past similar driving experiences. By integrating PDCE, MLN, and Multimodal RAG, SafeAuto significantly outperforms existing baselines across multiple datasets. This advancement enables more accurate, reliable, and safer autonomous driving systems that learn from experience, obey traffic laws, and perform precise control actions.
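As a rough illustration of the PDCE idea (not the paper's exact formulation), the sketch below weights the per-token cross-entropy of digit tokens by their position within the number, so errors in more significant digits are penalized more heavily; the function name, weighting scheme, and decay parameter are assumptions.

```python
import torch
import torch.nn.functional as F

def position_weighted_ce(logits, targets, digit_positions, decay=0.5):
    """Illustrative position-dependent cross-entropy: tokens that encode
    more significant digits of a control value receive larger weights.
    logits: (T, V) token logits, targets: (T,) token ids,
    digit_positions: per-token 0-based digit position, or None for
    non-numeric tokens. Hypothetical helper, not SafeAuto's API."""
    per_token = F.cross_entropy(logits, targets, reduction="none")  # (T,)
    weights = torch.ones_like(per_token)
    for i, pos in enumerate(digit_positions):
        if pos is not None:
            weights[i] = decay ** pos  # leading digit weighted the most
    return (weights * per_token).sum() / weights.sum()
```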
CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving
Booher, Jonathan, Rohanimanesh, Khashayar, Xu, Junhong, Isenbaev, Vladislav, Balakrishna, Ashwin, Gupta, Ishan, Liu, Wei, Petiushko, Aleksandr
Modern approaches to autonomous driving rely heavily on learned components trained with large amounts of human driving data via imitation learning. However, these methods require expensive, large-scale data collection and even then struggle to safely handle long-tail scenarios and compounding errors over time. At the same time, pure Reinforcement Learning (RL) methods can fail to learn performant policies in sparse, constrained, and challenging-to-define reward settings like driving. Both of these challenges make it difficult to deploy purely cloned policies in safety-critical applications such as autonomous vehicles. In this paper we propose the Combining IMitation and Reinforcement Learning (CIMRL) approach, a framework that enables training driving policies in simulation by leveraging imitative motion priors and safety constraints. CIMRL does not require extensive reward specification and improves on the closed-loop behavior of pure cloning methods. By combining RL and imitation, we demonstrate that our method achieves state-of-the-art results in closed-loop simulation driving benchmarks.
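A highly simplified sketch of how an imitative motion prior, an RL value function, and a safety critic might be combined at decision time; the structure, interfaces, and gating rule here are illustrative assumptions, not the CIMRL architecture.

```python
import torch

def select_action(prior_policy, q_value, safety_critic, obs,
                  n_proposals=16, cost_limit=0.0):
    """Sample candidate actions from an imitation-learned prior, mask out
    those the safety critic predicts to violate constraints, and pick the
    remaining candidate with the highest learned Q-value (sketch only;
    all callables are hypothetical)."""
    with torch.no_grad():
        candidates = prior_policy.sample(obs, n_proposals)   # (n, act_dim)
        costs = safety_critic(obs, candidates)               # (n,)
        values = q_value(obs, candidates)                    # (n,)
        safe = costs <= cost_limit
        if safe.any():
            values = values.masked_fill(~safe, float("-inf"))
        return candidates[values.argmax()]
```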
Multi-Constraint Safe RL with Objective Suppression for Safety-Critical Applications
Zhou, Zihan, Booher, Jonathan, Rohanimanesh, Khashayar, Liu, Wei, Petiushko, Aleksandr, Garg, Animesh
Safe reinforcement learning with multiple constraints is a challenging problem despite being very common in the real world. In safety-critical domains, properly handling the constraints becomes even more important. To address this challenge, we first describe the multi-constraint problem with a stronger Uniformly Constrained MDP (UCMDP) model; we then propose Objective Suppression, a novel method that adaptively suppresses the task-reward-maximizing objective according to a safety critic, as a solution to the Lagrangian dual of the UCMDP. We benchmark Objective Suppression in two multi-constraint safety domains, including an autonomous driving domain where any incorrect behavior can lead to disastrous consequences. Empirically, we demonstrate that our proposed method, when combined with existing safe RL algorithms, matches the task reward achieved by our baselines with significantly fewer constraint violations.
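A minimal sketch of the suppression idea, assuming a scalar task advantage, per-constraint cost advantages, and safety-critic estimates with thresholds; the suppression rule and names below are illustrative guesses, not the paper's derivation from the UCMDP Lagrangian dual.

```python
import torch

def suppressed_objective(task_adv, cost_advs, lambdas, safety_vals, thresholds):
    """Combine a task objective with multiple safety constraints: the task
    term is down-weighted ("suppressed") as the safety critics approach
    their thresholds, while Lagrangian terms penalize each cost.
    All arguments are scalar tensors; the exact rule is an assumption."""
    violation = torch.stack(
        [torch.clamp(v - t, min=0.0) for v, t in zip(safety_vals, thresholds)]
    ).sum()
    suppress = torch.exp(-violation)   # in (0, 1]; equals 1 far from violation
    return suppress * task_adv - sum(l * c for l, c in zip(lambdas, cost_advs))
```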
NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks
Kotelevskii, Nikita, Artemenkov, Aleksandr, Fedyanin, Kirill, Noskov, Fedor, Fishkov, Alexander, Petiushko, Aleksandr, Panov, Maxim
This paper proposes a fast and scalable method for uncertainty quantification of machine learning models' predictions. First, we show a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution. Importantly, the approach allows us to explicitly disentangle aleatoric and epistemic uncertainties. The resulting method works directly in the feature space; however, it can be applied to any neural network by considering the embedding of the data induced by the network. We demonstrate the strong performance of the method on uncertainty estimation tasks for a variety of real-world image datasets, such as MNIST, SVHN, CIFAR-100, and several versions of ImageNet.
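A toy version of the underlying idea, assuming points are compared in a fixed embedding space with a Gaussian kernel; the actual NUQ estimator and its aleatoric/epistemic decomposition are more careful than the simple proxies used here.

```python
import numpy as np

def nw_class_probs(query_emb, train_embs, train_labels, n_classes, bandwidth=1.0):
    """Nadaraya-Watson estimate of p(y | x) in an embedding space, with crude
    uncertainty proxies (illustrative sketch, not NUQ's estimator)."""
    d2 = np.sum((train_embs - query_emb) ** 2, axis=1)
    w = np.exp(-d2 / (2 * bandwidth ** 2))            # Gaussian kernel weights
    class_w = np.array([w[train_labels == c].sum() for c in range(n_classes)])
    total = class_w.sum()
    probs = class_w / max(total, 1e-12)
    aleatoric = 1.0 - probs.max()                     # label-noise proxy
    epistemic = 1.0 / (1.0 + total)                   # low total weight => sparse region
    return probs, aleatoric, epistemic
```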
Smoothed Embeddings for Certified Few-Shot Learning
Pautov, Mikhail, Kuznetsova, Olesya, Tursynbek, Nurislam, Petiushko, Aleksandr, Oseledets, Ivan
Randomized smoothing is considered to be the state-of-the-art provable defense against adversarial perturbations. However, it heavily exploits the fact that classifiers map input objects to class probabilities, and it does not cover models that learn a metric space in which classification is performed by computing distances to the embeddings of class prototypes. In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. We provide an analysis of the Lipschitz continuity of such models and derive a robustness certificate against $\ell_2$-bounded perturbations that may be useful in few-shot learning scenarios. Our theoretical results are confirmed by experiments on different datasets.
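A minimal Monte-Carlo sketch of the smoothed-embedding construction, assuming an encoder that maps inputs to unit-norm embeddings and a prototype-distance classifier; the sample count, noise scale, and names are illustrative, and the certificate itself is not shown.

```python
import torch
import torch.nn.functional as F

def smoothed_prediction(encoder, prototypes, x, sigma=0.25, n_samples=200):
    """Average normalized embeddings of Gaussian-perturbed copies of x and
    classify by cosine similarity to class prototypes (illustration of the
    smoothed classifier being certified, not the certification procedure)."""
    with torch.no_grad():
        noise = sigma * torch.randn(n_samples, *x.shape)
        z = F.normalize(encoder(x.unsqueeze(0) + noise), dim=1)   # (n, d)
        z_bar = F.normalize(z.mean(dim=0), dim=0)                 # smoothed embedding
        return torch.argmax(prototypes @ z_bar)                   # nearest prototype
```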
Many Heads but One Brain: an Overview of Fusion Brain Challenge on AI Journey 2021
Bakshandaeva, Daria, Dimitrov, Denis, Shonenkov, Alex, Potanin, Mark, Arkhipkin, Vladimir, Karachev, Denis, Davydova, Vera, Voronov, Anton, Martynov, Mikhail, Semenova, Natalia, Stepnov, Mikhail, Tutubalina, Elena, Chertok, Andrey, Petiushko, Aleksandr
Supporting the current trend in the AI community, we propose the AI Journey 2021 Challenge called Fusion Brain, which targets a universal architecture that processes different modalities (namely images, text, and code) and solves multiple tasks for vision and language. We have created datasets for each task to evaluate the participants' submissions. Moreover, we have released a new handwritten dataset in both Russian and English, which consists of 94,128 pairs of images and texts; the Russian part is the largest Russian handwritten dataset in the world. We also propose a baseline solution and corresponding task-specific solutions, as well as overall metrics.
CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks
Pautov, Mikhail, Tursynbek, Nurislam, Munkhoeva, Marina, Muravev, Nikita, Petiushko, Aleksandr, Oseledets, Ivan
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks -- small modifications of the input that change the predictions. Besides rigorously studied $\ell_p$-bounded additive perturbations, recently proposed semantic perturbations (e.g., rotation, translation) raise serious concerns about deploying ML systems in the real world. Therefore, it is important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds that can be used in general attack settings. We estimate the probability that a model fails if the attack is sampled from a certain distribution. Our theoretical findings are supported by experimental results on different datasets.
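The flavor of a Chernoff-style bound can be conveyed in a few lines: given prediction margins sampled under the attack distribution, Markov's inequality applied to exp(-t*m) bounds the failure probability. This is only a schematic of the general technique, with names and the grid of t values chosen for illustration, not the paper's certificate.

```python
import numpy as np

def chernoff_failure_bound(margins, ts=(0.5, 1.0, 2.0, 4.0)):
    """Empirical Chernoff-style bound on P(margin <= 0):
    P(m <= 0) <= E[exp(-t * m)] for any t > 0, so take the tightest t on a
    small grid. `margins` are correct-class-minus-runner-up scores sampled
    under the perturbation distribution (finite-sample error ignored here)."""
    m = np.asarray(margins, dtype=float)
    return float(min(np.mean(np.exp(-t * m)) for t in ts))
```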
Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation
Ivanov, Alexander, Nosovskiy, Gleb, Chekunov, Alexey, Fedoseev, Denis, Kibkalo, Vladislav, Nikulin, Mikhail, Popelenskiy, Fedor, Komkov, Stepan, Mazurenko, Ivan, Petiushko, Aleksandr
The manifold hypothesis states that data points in a high-dimensional space actually lie in close vicinity of a manifold of much lower dimension. In many cases this hypothesis has been empirically verified and used to enhance unsupervised and semi-supervised learning. Here we present a new approach to checking the manifold hypothesis and estimating the dimension of the underlying manifold. To do so, we use two very different methods simultaneously, one geometric and one probabilistic, and check whether they give the same result. Our geometric method is a modification, for sparse data, of the well-known box-counting algorithm for Minkowski dimension calculation. The probabilistic method is new. Although it exploits the standard nearest-neighbor distance, it differs from methods previously used in such situations. The method is robust and fast and includes a special preliminary data transformation. Experiments on real datasets show that the suggested approach, based on the combination of the two methods, is powerful and effective.
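For intuition, a naive box-counting estimate of the Minkowski dimension looks like the sketch below: count occupied boxes at several scales and fit the slope of log N(eps) against log(1/eps). The paper's version is adapted to sparse data and is paired with a separate nearest-neighbor-based estimator; the scales and names here are arbitrary.

```python
import numpy as np

def box_counting_dimension(points, scales=(1.0, 0.5, 0.25, 0.125)):
    """Naive box-counting (Minkowski) dimension estimate: the slope of
    log N(eps) vs log(1/eps), where N(eps) is the number of occupied
    axis-aligned boxes of side eps (illustrative only)."""
    points = np.asarray(points, dtype=float)
    counts = []
    for eps in scales:
        boxes = np.floor(points / eps)
        counts.append(len({tuple(b) for b in boxes}))
    log_inv_eps = np.log(1.0 / np.array(scales))
    log_counts = np.log(counts)
    slope, _ = np.polyfit(log_inv_eps, log_counts, 1)
    return slope
```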
Quadric hypersurface intersection for manifold learning in feature space
Pavutnitskiy, Fedor, Ivanov, Sergei O., Abramov, Evgeny, Borovitskiy, Viacheslav, Klochkov, Artem, Vialov, Viktor, Zaikovskii, Anatolii, Petiushko, Aleksandr
The knowledge that data lies close to a particular submanifold of the ambient Euclidean space may be useful in a number of ways. For instance, one may want to automatically mark any point far away from the submanifold as an outlier, or to use its geodesic distance to measure similarity between points. Classical problems for manifold learning are often posed in a very high dimension, e.g. for spaces of images or spaces of representations of words. Today, with deep representation learning on the rise in areas such as computer vision and natural language processing, many problems of this kind may be transformed into problems of moderately high dimension, typically of the order of hundreds. Motivated by this, we propose a manifold learning technique suitable for moderately high dimension and large datasets. The manifold is learned from the training data in the form of an intersection of quadric hypersurfaces -- simple but expressive objects. At test time, this manifold can be used to introduce an outlier score for arbitrary new points and to improve a given similarity metric by incorporating learned geometric structure into it.
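A stripped-down sketch of fitting such an intersection, assuming the quadrics are recovered as the smallest singular directions of a degree-2 feature map of the data; the paper's formulation adds constraints and regularization, and all names below are hypothetical.

```python
import numpy as np

def quadric_features(X):
    """Map points to monomials of degree <= 2, so each quadric
    q(x) = x^T A x + b^T x + c is a linear functional of phi(x)."""
    n, d = X.shape
    quad = np.einsum('ni,nj->nij', X, X).reshape(n, d * d)
    return np.hstack([quad, X, np.ones((n, 1))])

def fit_quadrics(X, n_quadrics=3):
    """Fit quadrics whose zero sets approximately contain the data: the
    right singular vectors of phi(X) with the smallest singular values give
    coefficient vectors w with phi(x) @ w close to 0 on the data (sketch)."""
    Phi = quadric_features(X)
    _, _, Vt = np.linalg.svd(Phi, full_matrices=False)
    return Vt[-n_quadrics:]                      # (n_quadrics, feature_dim)

def outlier_score(x, W):
    """Outlier score = squared residual of the learned quadric equations."""
    phi = quadric_features(x[None, :])
    return float(np.sum((phi @ W.T) ** 2))
```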