AITopics | Banff

Collaborating Authors

Banff

On The Fragility of Learned Reward Functions

McKinney, Lev, Duan, Yawen, Krueger, David, Gleave, Adam

arXiv.org Artificial IntelligenceJan-9-2023

Reward functions are notoriously difficult to specify, especially for tasks with complex goals. Reward learning approaches attempt to infer reward functions from human feedback and preferences. Prior works on reward learning have mainly focused on the performance of policies trained alongside the reward function. This practice, however, may fail to detect learned rewards that are not capable of training new policies from scratch and thus do not capture the intended behavior. Our work focuses on demonstrating and studying the causes of these relearning failures in the domain of preference-based reward learning. We demonstrate with experiments in tabular and continuous control environments that the severity of relearning failures can be sensitive to changes in reward model design and the trajectory dataset composition. Based on our findings, we emphasize the need for more retraining-based evaluations in the literature.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2301.03652

Country:

North America > Canada > Ontario > Toronto (0.28)
North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(15 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation

Yuan, Xiao-Tong, Li, Ping

arXiv.org Artificial IntelligenceJan-8-2023

The stochastic proximal point (SPP) methods have gained recent attention for stochastic optimization, with strong convergence guarantees and superior robustness to the classic stochastic gradient descent (SGD) methods showcased at little to no cost of computational overhead added. In this article, we study a minibatch variant of SPP, namely M-SPP, for solving convex composite risk minimization problems. The core contribution is a set of novel excess risk bounds of M-SPP derived through the lens of algorithmic stability theory. Particularly under smoothness and quadratic growth conditions, we show that M-SPP with minibatch-size $n$ and iteration count $T$ enjoys an in-expectation fast rate of convergence consisting of an $\mathcal{O}\left(\frac{1}{T^2}\right)$ bias decaying term and an $\mathcal{O}\left(\frac{1}{nT}\right)$ variance decaying term. In the small-$n$-large-$T$ setting, this result substantially improves the best known results of SPP-type approaches by revealing the impact of noise level of model on convergence rate. In the complementary small-$T$-large-$n$ regime, we provide a two-phase extension of M-SPP to achieve comparable convergence rates. Moreover, we derive a near-tight high probability (over the randomness of data) bound on the parameter estimation error of a sampling-without-replacement variant of M-SPP. Numerical evidences are provided to support our theoretical predictions when substantialized to Lasso and logistic regression models.

artificial intelligence, m-spp, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2301.03125

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Quebec > Montreal (0.04)
(23 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Vehicle in Virtual Environment (VVE) Method

Gelbal, Sukru Yaren, Aksun-Guvenc, Bilin, Guvenc, Levent

arXiv.org Artificial IntelligenceDec-21-2022

Autonomous vehicle (AV) algorithms need to be tested extensively in order to make sure the vehicle and the passengers will be safe while using it after the implementation. Testing these algorithms in real world create another important safety critical point. Real world testing is also subjected to limitations such as logistic limitations to carry or drive the vehicle to a certain location. For this purpose, hardware in the loop (HIL) simulations as well as virtual environments such as CARLA and LG SVL are used widely. This paper discusses a method that combines the real vehicle with the virtual world, called vehicle in virtual environment (VVE). This method projects the vehicle location and heading into a virtual world for desired testing, and transfers back the information from sensors in the virtual world to the vehicle. As a result, while vehicle is moving in the real world, it simultaneously moves in the virtual world and obtains the situational awareness via multiple virtual sensors. This would allow testing in a safe environment with the real vehicle while providing some additional benefits on vehicle dynamics fidelity, logistics limitations and passenger experience testing. The paper also demonstrates an example case study where path following and the virtual sensors are utilized to test a radar based stopping algorithm.

artificial intelligence, human computer interaction, vehicle, (16 more...)

arXiv.org Artificial Intelligence

2212.11469

Country:

North America > United States > Ohio > Franklin County > Columbus (0.14)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
(5 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Transportation > Passenger (0.88)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Add feedback

GD-VAEs: Geometric Dynamic Variational Autoencoders for Learning Nonlinear Dynamics and Dimension Reductions

Lopez, Ryan, Atzberger, Paul J.

arXiv.org Artificial IntelligenceDec-19-2022

We develop data-driven methods incorporating geometric and topological information to learn parsimonious representations of nonlinear dynamics from observations. We develop approaches for learning nonlinear state space models of the dynamics for general manifold latent spaces using training strategies related to Variational Autoencoders (VAEs). Our methods are referred to as Geometric Dynamic (GD) Variational Autoencoders (GD-VAEs). We learn encoders and decoders for the system states and evolution based on deep neural network architectures that include general Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Transpose CNNs (T-CNNs). Motivated by problems arising in parameterized PDEs and physics, we investigate the performance of our methods on tasks for learning low dimensional representations of the nonlinear Burgers equations, constrained mechanical systems, and spatial fields of reaction-diffusion systems. GD-VAEs provide methods for obtaining representations for use in diverse learning tasks involving dynamics.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2206.05183

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
North America > United States > Colorado > Denver County > Denver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(7 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

ASAT: Adaptively Scaled Adversarial Training in Time Series

Zhang, Zhiyuan, Li, Wei, Bao, Ruihan, Harimoto, Keiko, Wu, Yunfang, Sun, Xu

arXiv.org Artificial IntelligenceDec-19-2022

Adversarial training is a method for enhancing neural networks to improve the robustness against adversarial examples. Besides the security concerns of potential adversarial examples, adversarial training can also improve the generalization ability of neural networks, train robust neural networks, and provide interpretability for neural networks. In this work, we introduce adversarial training in time series analysis to enhance the neural networks for better generalization ability by taking the finance field as an example. Rethinking existing research on adversarial training, we propose the adaptively scaled adversarial training (ASAT) in time series analysis, by rescaling data at different time slots with adaptive scales. Experimental results show that the proposed ASAT can improve both the generalization ability and the adversarial robustness of neural networks compared to the baselines. Compared to the traditional adversarial training algorithm, ASAT can achieve better generalization ability and similar adversarial robustness.

adversarial training, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neucom.2022.12.013

2108.08976

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(15 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Banking & Finance > Trading (0.67)
Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Speed up the inference of diffusion models via shortcut MCMC sampling

Chen, Gang

arXiv.org Artificial IntelligenceDec-18-2022

Diffusion probabilistic models have generated high quality image synthesis recently. However, one pain point is the notorious inference to gradually obtain clear images with thousands of steps, which is time consuming compared to other generative models. In this paper, we present a shortcut MCMC sampling algorithm, which balances training and inference, while keeping the generated data's quality. In particular, we add the global fidelity constraint with shortcut MCMC sampling to combat the local fitting from diffusion models. We do some initial experiments and show very promising results.

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2301.01206

Country:

North America > United States > New York > Erie County > Buffalo (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Europe > Austria (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.85)

Add feedback

Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving

Francis, Jonathan, Chen, Bingqing, Yao, Weiran, Nyberg, Eric, Oh, Jean

arXiv.org Artificial IntelligenceDec-16-2022

The feasibility of collecting a large amount of expert demonstrations has inspired growing research interests in learning-to-drive settings, where models learn by imitating the driving behaviour from experts. However, exclusively relying on imitation can limit agents' generalisability to novel scenarios that are outside the support of the training data. In this paper, we address this challenge by factorising the driving task, based on the intuition that modular architectures are more generalisable and more robust to changes in the environment compared to monolithic, end-to-end frameworks. Specifically, we draw inspiration from the trajectory forecasting community and reformulate the learning-to-drive task as obstacle-aware perception and grounding, distribution-aware goal prediction, and model-based planning. Firstly, we train the obstacle-aware perception module to extract salient representation of the visual context. Then, we learn a multi-modal goal distribution by performing conditional density-estimation using normalising flow. Finally, we ground candidate trajectory predictions road geometry, and plan the actions based on on vehicle dynamics. Under the CARLA simulator, we report state-of-the-art results on the CARNOVEL benchmark.

artificial intelligence, machine learning, prediction and conformant model-based planning, (11 more...)

arXiv.org Artificial Intelligence

2212.08729

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.70)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.84)

Add feedback

RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis

Oh, Shinhyeok, Noh, HyeongRae, Hong, Yoonseok, Oh, Insoo

arXiv.org Artificial IntelligenceDec-15-2022

With the advent of deep learning, a huge number of text-to-speech (TTS) models which produce human-like speech have emerged. Recently, by introducing syntactic and semantic information w.r.t the input text, various approaches have been proposed to enrich the naturalness and expressiveness of TTS models. Although these strategies showed impressive results, they still have some limitations in utilizing language information. First, most approaches only use graph networks to utilize syntactic and semantic information without considering linguistic features. Second, most previous works do not explicitly consider adjacent words when encoding syntactic and semantic information, even though it is obvious that adjacent words are usually meaningful when encoding the current word. To address these issues, we propose Relation-aware Word Encoding Network (RWEN), which effectively allows syntactic and semantic information based on two modules (i.e., Semantic-level Relation Encoding and Adjacent Word Relation Encoding). Experimental results show substantial improvements compared to previous works.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2212.07939

Country:

Asia > South Korea > Incheon > Incheon (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(16 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks

Antoniou, Nikolaos, Georgiou, Efthymios, Potamianos, Alexandros

arXiv.org Artificial IntelligenceDec-15-2022

Designing powerful adversarial attacks is of paramount importance for the evaluation of $\ell_p$-bounded adversarial defenses. Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms to generate such adversaries. The search space of PGD is dictated by the steepest ascent directions of an objective. Despite the plethora of objective function choices, there is no universally superior option and robustness overestimation may arise from ill-suited objective selection. Driven by this observation, we postulate that the combination of different objectives through a simple loss alternating scheme renders PGD more robust towards design choices. We experimentally verify this assertion on a synthetic-data example and by evaluating our proposed method across 25 different $\ell_{\infty}$-robust models and 3 datasets. The performance improvement is consistent, when compared to the single loss counterparts. In the CIFAR-10 dataset, our strongest adversarial attack outperforms all of the white-box components of AutoAttack (AA) ensemble, as well as the most powerful attacks existing on the literature, achieving state-of-the-art results in the computational budget of our study ($T=100$, no restarts).

artificial intelligence, machine learning, objective, (17 more...)

arXiv.org Artificial Intelligence

2212.07992

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.82)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding

Pellegrini, Lorenzo, Zhu, Chenchen, Xiao, Fanyi, Yan, Zhicheng, Carta, Antonio, De Lange, Matthias, Lomonaco, Vincenzo, Sumbaly, Roshan, Rodriguez, Pau, Vazquez, David

arXiv.org Artificial IntelligenceDec-13-2022

Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been held in recent years, as they are an excellent opportunity to stimulate research in promising directions. This paper summarizes the ideas, design choices, rules, and results of the challenge held at the 3rd Continual Learning in Computer Vision (CLVision) Workshop at CVPR 2022. The focus of this competition is the complex continual object detection task, which is still underexplored in literature compared to classification tasks. The challenge is based on the challenge version of the novel EgoObjects dataset, a large-scale egocentric object dataset explicitly designed to benchmark continual learning algorithms for egocentric category-/instance-level object understanding, which covers more than 1k unique main objects and 250+ categories in around 100k video frames.

artificial intelligence, machine learning, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2212.06833

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback