Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming

arXiv.org Artificial Intelligence

We propose a unified approach to obtain structured sparse optimal paths in the latent space of a variational autoencoder (VAE) using dynamic programming and Gumbel propagation. We solve the classical optimal path problem with a probability-softening solution, called the stochastic optimal path, and transform a wide range of DP problems into directed acyclic graphs in which all possible paths follow a Gibbs distribution. We show the equivalence of the Gibbs distribution to a message-passing algorithm by the properties of the Gumbel distribution and give all the ingredients required for variational Bayesian inference. Our approach to obtaining latent optimal paths enables end-to-end training for generative tasks in which models rely on the information of unobserved structural features. We validate the behavior of our approach and showcase its applicability in two real-world applications: text-to-speech and singing voice synthesis.
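The idea that all paths through a directed acyclic graph follow a Gibbs distribution computable by message passing can be illustrated with a minimal sketch. It does not reproduce the paper's Gumbel propagation; it uses plain log-sum-exp messages and ancestral sampling, which is one standard way to realize such a distribution. The toy graph, the edge scores, the temperature `alpha`, and all function names are assumptions made for illustration.

```python
import math
import random

# Hypothetical toy DAG: node -> list of (successor, edge_score). Node 0 is the
# source and node 3 is the sink; node ids are already in topological order.
EDGES = {
    0: [(1, 1.0), (2, 0.5)],
    1: [(2, 0.2), (3, 1.5)],
    2: [(3, 2.0)],
    3: [],
}
SINK = 3


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def backward_messages(edges, sink, alpha=1.0):
    """log_z[u] = log of the sum over all u->sink paths of exp(alpha * path score)."""
    log_z = {sink: 0.0}
    for u in sorted(edges, reverse=True):  # reverse topological order
        if u == sink:
            continue
        log_z[u] = logsumexp([alpha * w + log_z[v] for v, w in edges[u]])
    return log_z


def sample_path(edges, sink, log_z, alpha=1.0, rng=random):
    """Ancestral sampling from node 0: each edge is chosen in proportion to its message."""
    path, u = [0], 0
    while u != sink:
        succ = edges[u]
        probs = [math.exp(alpha * w + log_z[v] - log_z[u]) for v, w in succ]
        u = rng.choices([v for v, _ in succ], weights=probs)[0]
        path.append(u)
    return path


def path_log_prob(path, edges, log_z, alpha=1.0):
    """Gibbs log-probability of a complete source->sink path."""
    score = sum(dict(edges[u])[v] for u, v in zip(path, path[1:]))
    return alpha * score - log_z[path[0]]
```

Because the local edge probabilities telescope, the probability of a sampled path equals its Gibbs weight divided by the partition function stored at the source, so the probabilities of all source-to-sink paths sum to one.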


Deep Optimal Transport: A Practical Algorithm for Photo-realistic Image Restoration

arXiv.org Artificial Intelligence

We propose an image restoration algorithm that can control the perceptual quality and/or the mean square error (MSE) of any pre-trained model, trading one for the other at test time. Our algorithm is few-shot: Given about a dozen images restored by the model, it can significantly improve the perceptual quality and/or the MSE of the model for newly restored images without further training. Our approach is motivated by a recent theoretical result that links the minimum MSE (MMSE) predictor and the predictor that minimizes the MSE under a perfect perceptual quality constraint. Specifically, it has been shown that the latter can be obtained by optimally transporting the output of the former, such that its distribution matches the source data. Thus, to improve the perceptual quality of a predictor that was originally trained to minimize MSE, we approximate the optimal transport by a linear transformation in the latent space of a variational auto-encoder, which we compute in closed form using empirical means and covariances. Going beyond the theory, we find that applying the same procedure to models that were initially trained to achieve high perceptual quality typically improves their perceptual quality even further. By interpolating the results with the original output of the model, we can also improve their MSE at the expense of perceptual quality. We illustrate our method on a variety of degradations applied to general content images of arbitrary dimensions.
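A closed-form linear transport computed from empirical statistics can be sketched as follows. This is a deliberately simplified illustration: it assumes diagonal covariances, so the Gaussian optimal transport map reduces to a per-dimension shift and rescale, whereas the abstract refers to full covariances. The function names, the `eps` stabilizer, and the interpolation `weight` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_diagonal_gaussian(samples):
    """Per-dimension mean and standard deviation of a list of equal-length vectors."""
    dims = list(zip(*samples))
    means = [fmean(d) for d in dims]
    stds = [math.sqrt(pvariance(d)) for d in dims]
    return means, stds


def transport(x, src, tgt, eps=1e-8):
    """Map x so that the source (mean, std) statistics become the target statistics.

    Assumption: diagonal covariances, so each dimension is handled independently.
    """
    (mu_s, sd_s), (mu_t, sd_t) = src, tgt
    return [mt + (st / (ss + eps)) * (xi - ms)
            for xi, ms, ss, mt, st in zip(x, mu_s, sd_s, mu_t, sd_t)]


def interpolate(x, x_transported, weight):
    """weight=0 keeps the original output; weight=1 uses the transported one."""
    return [(1 - weight) * a + weight * b for a, b in zip(x, x_transported)]
```

In this sketch, `src` would be fitted on latents of a handful of restored images and `tgt` on latents of natural images; `interpolate` mirrors the trade-off between the transported result and the model's original output.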


Discrete-time Robust PD Controlled System with DOB/CDOB Compensation for High Speed Autonomous Vehicle Path Following

arXiv.org Artificial Intelligence

Autonomous vehicle path following performance is a significant consideration. This paper presents a discrete-time design of a robust PD controlled system with disturbance observer (DOB) and communication disturbance observer (CDOB) compensation to enhance autonomous vehicle path following performance. Although always implemented on digital devices, DOB and CDOB structures are usually designed in continuous time in the literature and also in our previous work. However, a high sampling rate is required for a continuous-time block diagram to be converted automatically to the corresponding discrete-time controller using rapid controller prototyping systems. In this paper, direct discrete-time design is carried out. A digital PD feedback controller is designed based on the nominal plant using the proposed parameter space approach. The zero-order hold method is applied to discretize the continuous-domain nominal plant, DOB and CDOB structures. The discrete-time DOB is embedded into the steering to path following error loop for model regulation in the presence of uncertainty in vehicle parameters such as vehicle mass, vehicle speed and road-tire friction coefficient, and for rejecting external disturbances such as crosswind force. On the other hand, time delay from CAN bus based sensor and actuator command interfaces degrades system performance, since large negative phase angles are added to the plant frequency response. The discrete-time CDOB compensated control system can be used for time delay compensation where accurate knowledge of the delay time value is not necessary. A validated model of our lab Ford Fusion hybrid automated driving research vehicle is used for the simulation analysis while the vehicle is driving at high speed. Simulation results demonstrate the improvement of autonomous vehicle path following performance with the proposed discrete-time DOB and CDOB structure.
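The combination of zero-order-hold discretization and a digital PD law can be sketched on a toy plant. The example below assumes a double integrator as a stand-in for the lateral error dynamics; it is not the validated vehicle model from the paper and it omits the DOB and CDOB entirely. The gains, sample time, and function names are hypothetical.

```python
def zoh_double_integrator_step(pos, vel, u, dt):
    """Exact zero-order-hold update of pos'' = u over one sample period dt."""
    new_pos = pos + dt * vel + 0.5 * dt * dt * u
    new_vel = vel + dt * u
    return new_pos, new_vel


def simulate_pd(kp, kd, dt, steps, initial_error):
    """Drive the position to zero with a discrete PD law; returns the position history."""
    pos, vel = initial_error, 0.0
    prev_error = -pos  # equals the first error, so there is no derivative kick
    history = []
    for _ in range(steps):
        error = 0.0 - pos  # the reference is zero lateral error
        # Backward-difference derivative, as a digital controller would compute it.
        u = kp * error + kd * (error - prev_error) / dt
        prev_error = error
        pos, vel = zoh_double_integrator_step(pos, vel, u, dt)
        history.append(pos)
    return history
```

With `kp=4.0`, `kd=4.0`, and `dt=0.01`, an initial offset of 1.0 decays toward zero within a few seconds of simulated time. Designing directly in discrete time, as the abstract describes, means the sample period appears explicitly in both the plant update and the control law.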


Cooperative Collision Avoidance in a Connected Vehicle Environment

arXiv.org Artificial Intelligence

Connected vehicle (CV) technology is among the most heavily researched areas in both academia and industry. The vehicle to vehicle (V2V), vehicle to infrastructure (V2I) and vehicle to pedestrian (V2P) communication capabilities enable critical situational awareness. In some cases, these vehicle communication safety capabilities can overcome the shortcomings that other sensor-based safety capabilities suffer under external conditions such as 'No Line of Sight' (NLOS) or very harsh weather. Connected vehicles will help cities and states reduce traffic congestion, improve fuel efficiency and improve the safety of vehicles and pedestrians. On the road, cars will be able to communicate with one another, automatically transmitting data such as speed, position, and direction, and send alerts to each other if a crash seems imminent. The main focus of this paper is the implementation of Cooperative Collision Avoidance (CCA) for connected vehicles. It leverages Vehicle to Everything (V2X) communication technology to create a real-time implementable collision avoidance algorithm, along with decision-making, for a vehicle that communicates with other vehicles. Four distinct collision risk environments are simulated on a cost-effective Connected Autonomous Vehicle (CAV) Hardware in the Loop (HIL) simulator to test the overall algorithm in real time with real electronic control and communication hardware.
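How exchanged speed, position, and direction data could feed an imminent-crash alert can be illustrated with a closest-point-of-approach check. This is a generic geometric sketch under a constant-velocity assumption, not the CCA algorithm of the paper; the safety radius, time horizon, and function names are hypothetical.

```python
import math


def closest_approach(pos_a, vel_a, pos_b, vel_b):
    """Time (s, clamped to >= 0) and distance (m) of closest approach of two vehicles.

    Assumes both vehicles hold their current 2D velocity.
    """
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0  # no relative motion: the gap never changes
    else:
        t = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t, math.hypot(rx + vx * t, ry + vy * t)


def collision_risk(pos_a, vel_a, pos_b, vel_b, radius_m=2.0, horizon_s=6.0):
    """True if the vehicles come within radius_m of each other inside horizon_s."""
    t, dist = closest_approach(pos_a, vel_a, pos_b, vel_b)
    return dist < radius_m and t <= horizon_s
```

For two vehicles approaching head-on at 10 m/s each from 100 m apart, the closest approach occurs after 5 s at zero distance, so the check raises an alert; vehicles travelling side by side at the same velocity in lanes 3.5 m apart do not trigger it.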


Hardware-in-the-Loop and Road Testing of RLVW and GLOSA Connected Vehicle Applications

arXiv.org Artificial Intelligence

This paper presents an evaluation of two different Vehicle to Infrastructure (V2I) applications, namely Red Light Violation Warning (RLVW) and Green Light Optimized Speed Advisory (GLOSA). The evaluation method is to first develop and use Hardware-in-the-Loop (HIL) simulator testing, followed by extension of the HIL testing to road testing using an experimental connected vehicle. The HIL simulator used in the testing is a state-of-the-art simulator that contains the same hardware, such as the roadside unit and traffic cabinet, that is used in real intersections, and it allows realistic testing of numerous traffic, intersection geometry and timing scenarios. First, the RLVW V2I algorithm is tested in the HIL simulator and then implemented in an On-Board-Unit (OBU) in our experimental vehicle and tested at real-world intersections. This same approach of HIL testing followed by testing in real intersections using our experimental vehicle is later extended to the GLOSA application. The GLOSA application tested in this paper provides an optimal speed advisory for passing at the green light and also includes a red light violation warning system. The paper presents the HIL and experimental vehicle evaluation systems, information about RLVW and GLOSA, and the HIL simulation and road testing results and their interpretations.
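The notion of a speed advisory for passing during the green phase can be sketched as a feasibility check on arrival time. This is a minimal illustration, not the GLOSA algorithm evaluated in the paper: it assumes a single upcoming green window, constant-speed travel, and hypothetical parameter names, and it simply returns the fastest admissible speed.

```python
def advisory_speed(distance_m, green_start_s, green_end_s, v_min, v_max):
    """Fastest speed (m/s) in [v_min, v_max] that reaches the stop line during green.

    green_start_s and green_end_s are seconds from now (start <= 0 means green now).
    Returns None when no admissible speed arrives inside the green window.
    """
    if distance_m <= 0 or green_end_s <= 0:
        return None
    slowest = distance_m / green_end_s  # arrive just before the green ends
    # If the light is already green there is no upper limit from the signal.
    fastest = v_max if green_start_s <= 0 else distance_m / green_start_s
    low = max(slowest, v_min)
    high = min(fastest, v_max)
    return high if low <= high else None
```

For example, at 200 m from the stop line with the green window opening in 20 s and closing in 30 s, and speed limits of 5 to 15 m/s, the sketch advises 10 m/s. A `None` result corresponds to the case in which the vehicle should prepare to stop, which is where a red light violation warning would apply.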


Multilingual Conceptual Coverage in Text-to-Image Models

arXiv.org Artificial Intelligence

We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns. For each model we can assess "conceptual coverage" of a given target language relative to a source language by comparing the population of images generated for a series of tangible nouns in the source language to the population of images generated for each noun under translation in the target language. This technique allows us to estimate how well-suited a model is to a target language as well as identify model-specific weaknesses, spurious correlations, and biases without a priori assumptions. We demonstrate how it can be used to benchmark T2I models in terms of multilinguality, and how, despite its simplicity, it is a good proxy for impressive generalization.


Knowledge Graph Reasoning over Entities and Numerical Values

arXiv.org Artificial Intelligence

A complex logic query in a knowledge graph refers to a query expressed in logic form that conveys a complex meaning, such as "Where did the Canadian Turing award winner graduate from?" Knowledge graph reasoning-based applications, such as dialogue systems and interactive search engines, rely on the ability to answer complex logic queries as a fundamental task. In most knowledge graphs, edges are typically used to describe either the relationships between entities or their associated attribute values. An attribute value can be in categorical or numerical format, such as dates, years, sizes, etc. However, existing complex query answering (CQA) methods simply treat numerical values in the same way as they treat entities. This can lead to difficulties in answering certain queries, such as "Which Australian Pulitzer award winner was born before 1927?" and "Which drug is a pain reliever and has fewer side effects than Paracetamol?" In this work, inspired by the recent advances in numerical encoding and knowledge graph reasoning, we propose numerical complex query answering. In this task, we introduce new numerical variables and operations to describe queries involving numerical attribute values. To address the difference between entities and numerical values, we also propose the framework of Number Reasoning Network (NRN) for alternatively encoding entities and numerical values into separate encoding structures. During the numerical encoding process, NRN employs a parameterized density function to encode the distribution of numerical values. During the entity encoding process, NRN uses established query encoding methods for the original CQA problem. Experimental results show that NRN consistently improves various query encoding methods on three different knowledge graphs and achieves state-of-the-art results.


On the Effectiveness of Hybrid Mutual Information Estimation

arXiv.org Artificial Intelligence

Estimating the mutual information from samples from a joint distribution is a challenging problem in both science and engineering. In this work, we realize a variational bound that generalizes both discriminative and generative approaches. Using this bound, we propose a hybrid method to mitigate their respective shortcomings. Further, we propose Predictive Quantization (PQ): a simple generative method that can be easily combined with discriminative estimators for minimal computational overhead. Our propositions yield a tighter bound on the information thanks to the reduced variance of the estimator. We test our methods on a challenging task of correlated high-dimensional Gaussian distributions and a stochastic process involving a system of free particles subjected to a fixed energy landscape. Empirical results show that hybrid methods consistently improved mutual information estimates when compared to the corresponding discriminative counterpart.


UKP-SQuARE: An Interactive Tool for Teaching Question Answering

arXiv.org Artificial Intelligence

The exponential growth of question answering (QA) has made it an indispensable topic in any Natural Language Processing (NLP) course. Additionally, the breadth of QA derived from this exponential growth makes it an ideal scenario for teaching related NLP topics such as information retrieval, explainability, and adversarial attacks among others. In this paper, we introduce UKP-SQuARE as a platform for QA education. This platform provides an interactive environment where students can run, compare, and analyze various QA models from different perspectives, such as general behavior, explainability, and robustness. Therefore, students can get a first-hand experience in different QA techniques during the class. Thanks to this, we propose a learner-centered approach for QA education in which students proactively learn theoretical concepts and acquire problem-solving skills through interactive exploration, experimentation, and practical assignments, rather than solely relying on traditional lectures. To evaluate the effectiveness of UKP-SQuARE in teaching scenarios, we adopted it in a postgraduate NLP course and surveyed the students after the course. Their positive feedback shows the platform's effectiveness in their course and invites a wider adoption.


Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text

arXiv.org Artificial Intelligence

Self-supervised representation learning has proved to be a valuable component for out-of-distribution (OoD) detection with only the texts of in-distribution (ID) examples. These approaches either train a language model from scratch or fine-tune a pre-trained language model using ID examples, and then take the perplexity output by the language model as the OoD score. In this paper, we analyze the complementary characteristics of both OoD detection methods and propose a multi-level knowledge distillation approach that integrates their strengths while mitigating their limitations. Specifically, we use a fine-tuned model as the teacher to teach a randomly initialized student model on the ID examples. Besides the prediction layer distillation, we present a similarity-based intermediate layer distillation method to thoroughly explore the representation space of the teacher model. In this way, the learned student can better represent the ID data manifold while gaining a stronger ability to map OoD examples outside the ID data manifold with the regularization inherited from pre-training. Besides, the student model sees only ID examples during parameter learning, further promoting more distinguishable features for OoD detection. We conduct extensive experiments over multiple benchmark datasets, i.e., CLINC150, SST, ROSTD, 20 NewsGroups, and AG News, showing that the proposed method yields new state-of-the-art performance. We also explore its application as an AIGC detector to distinguish between answers generated by ChatGPT and human experts. It is observed that our model exceeds human evaluators in the pair-expert task on the Human ChatGPT Comparison Corpus.
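Using language-model perplexity as an OoD score can be sketched in a few lines. The example assumes the per-token natural-log probabilities have already been produced by some language model; the function names and the decision threshold are hypothetical, and the distillation procedure itself is not shown.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of a text given per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def is_out_of_distribution(token_log_probs, threshold):
    """Flag a text as OoD when its perplexity exceeds a chosen threshold."""
    return perplexity(token_log_probs) > threshold
```

A text whose tokens each receive probability 0.5 has perplexity 2, while one whose tokens each receive probability 0.01 has perplexity 100; in practice the threshold would be tuned on held-out ID data.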