AITopics

2405.09062

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.32)

Industry:

Leisure & Entertainment (0.94)
Media > Music (0.68)
Health & Medicine > Health Care Technology (0.47)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ablett, Trevor, Chan, Bryan, Wang, Jayce Haoran, Kelly, Jonathan

Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations

arXiv.org Artificial IntelligenceJul-3-2024

Learning from examples of success is an appealing approach to reinforcement learning that eliminates many of the disadvantages of using hand-crafted reward functions or full expert-demonstration trajectories, both of which can be difficult to acquire, biased, or suboptimal. However, learning from examples alone dramatically increases the exploration challenge, especially for complex tasks. This work introduces value-penalized auxiliary control from examples (VPACE); we significantly improve exploration in example-based control by adding scheduled auxiliary control and examples of auxiliary tasks. Furthermore, we identify a value-calibration problem, where policy value estimates can exceed their theoretical limits based on successful data. We resolve this problem, which is exacerbated by learning auxiliary tasks, through the addition of an above-success-level value penalty. Across three simulated and one real robotic manipulation environment, and 21 different main tasks, we show that our approach substantially improves learning efficiency. Videos, code, and datasets are available at https://papers.starslab.ca/vpace.

auxiliary task, learning, reinforcement learning, (15 more...)

2407.03311

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(18 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine LearningJul-3-2024

A Geometric Framework for Adversarial Vulnerability in Machine Learning

Bell, Brian

This work starts with the intention of using mathematics to understand the intriguing vulnerability observed by ~\citet{szegedy2013} within artificial neural networks. Along the way, we will develop some novel tools with applications far outside of just the adversarial domain. We will do this while developing a rigorous mathematical framework to examine this problem. Our goal is to build out theory which can support increasingly sophisticated conjecture about adversarial attacks with a particular focus on the so called ``Dimpled Manifold Hypothesis'' by ~\citet{shamir2021dimpled}. Chapter one will cover the history and architecture of neural network architectures. Chapter two is focused on the background of adversarial vulnerability. Starting from the seminal paper by ~\citet{szegedy2013} we will develop the theory of adversarial perturbation and attack. Chapter three will build a theory of persistence that is related to Ricci Curvature, which can be used to measure properties of decision boundaries. We will use this foundation to make a conjecture relating adversarial attacks. Chapters four and five represent a sudden and wonderful digression that examines an intriguing related body of theory for spatial analysis of neural networks as approximations of kernel machines and becomes a novel theory for representing neural networks with bilinear maps. These heavily mathematical chapters will set up a framework and begin exploring applications of what may become a very important theoretical foundation for analyzing neural network learning with spatial and geometric information. We will conclude by setting up our new methods to address the conjecture from chapter 3 in continuing research.

adversarial example, kernel representation, neural information processing system, (14 more...)

2407.11029

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(17 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Government > Military (0.87)
Information Technology > Security & Privacy (0.87)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Machine LearningJul-2-2024

Structured Partial Stochasticity in Bayesian Neural Networks

Rochussen, Tommy

Bayesian neural network posterior distributions have a great number of modes that correspond to the same network function. The abundance of such modes can make it difficult for approximate inference methods to do their job. Recent work has demonstrated the benefits of partial stochasticity for approximate inference in Bayesian neural networks; inference can be less costly and performance can sometimes be improved. I propose a structured way to select the deterministic subset of weights that removes neuron permutation symmetries, and therefore the corresponding redundant posterior modes. With a drastically simplified posterior distribution, the performance of existing approximate inference schemes is found to be greatly improved.

neuron, permutation symmetry, symmetry, (9 more...)

2405.17666

Country: North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fellaji, Mohammed, Pennerath, Frédéric

The Epistemic Uncertainty Hole: an issue of Bayesian Neural Networks

arXiv.org Machine LearningJul-2-2024

More precisely, we observe that the epistemic uncertainty In many applications of Machine Learning, optimizing collapses literally in the presence of large models and solely the performance metrics of the predictive model, sometimes also of little training data, while we expect the such as the accuracy, can result in overconfident interpretations exact opposite behaviour. This phenomenon, which we call of erroneous outcomes, and thus, hazardous decisions "epistemic uncertainty hole", is all the more problematic as in case of critical domains. Therefore, being able to map the it undermines the entire applicative potential of BDL, which model outputs to some uncertainty quantification metrics, if is based precisely on the use of epistemic uncertainty. As well calibrated, is essential from a decision making point of an example, we evaluate the practical consequences of this view. When dealing with Deep Learning models, Bayesian uncertainty hole on one of the main applications of BDL, Deep Learning (BDL) [11, 12, 18, 10, 2], i.e. the application namely the detection of out-of-distribution samples. of Bayesian inference to deep neural networks, appears to be one of the keys to estimate such well-calibrated uncertainties.

cifar10, epistemic uncertainty, neural network, (15 more...)

2407.01985

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Lim, Jen Ning, Johansen, Adam M.

Particle Semi-Implicit Variational Inference

arXiv.org Machine LearningJun-30-2024

Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by utilizing a kernel and a mixing distribution to hierarchically define the variational distribution. Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities. As a result, directly maximizing the evidence lower bound (ELBO) is not possible and so, they resort to either: optimizing bounds on the ELBO, employing costly inner-loop Markov chain Monte Carlo runs, or solving minimax objectives. In this paper, we propose a novel method for SIVI called Particle Variational Inference (PVI) which employs empirical measures to approximate the optimal mixing distributions characterized as the minimizer of a natural free energy functional via a particle approximation of an Euclidean--Wasserstein gradient flow. This approach means that, unlike prior works, PVI can directly optimize the ELBO; furthermore, it makes no parametric assumption about the mixing distribution. Our empirical results demonstrate that PVI performs favourably against other SIVI methods across various tasks. Moreover, we provide a theoretical analysis of the behaviour of the gradient flow of a related free energy functional: establishing the existence and uniqueness of solutions as well as propagation of chaos results.

gradient flow, inequality, prop, (13 more...)

2407.00649

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

arXiv.org Artificial IntelligenceJun-28-2024

External Model Motivated Agents: Reinforcement Learning for Enhanced Environment Sampling

Bhagat, Rishav, Balloch, Jonathan, Lin, Zhiyu, Kim, Julia, Riedl, Mark

Unlike reinforcement learning (RL) agents, humans remain capable multitaskers in changing environments. In spite of only experiencing the world through their own observations and interactions, people know how to balance focusing on tasks with learning about how changes may affect their understanding of the world. This is possible by choosing to solve tasks in ways that are interesting and generally informative beyond just the current task. Motivated by this, we propose an agent influence framework for RL agents to improve the adaptation efficiency of external models in changing environments without any changes to the agent's rewards. Our formulation is composed of two self-contained modules: interest fields and behavior shaping via interest fields. We implement an uncertainty-based interest field algorithm as well as a skill-sampling-based behavior-shaping algorithm to use in testing this framework. Our results show that our method outperforms the baselines in terms of external model adaptation on metrics that measure both efficiency and performance.

agent, external model, rollout, (13 more...)

2407.00264

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 15 > Municipal District of Bighorn No. 8 > Kananaskis (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Metals & Mining (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Vandenhirtz, Moritz, Laguna, Sonia, Marcinkevičs, Ričards, Vogt, Julia E.

Stochastic Concept Bottleneck Models

arXiv.org Artificial IntelligenceJun-27-2024

Concept Bottleneck Models (CBMs) have emerged as a promising interpretable method whose final prediction is based on intermediate, human-understandable concepts rather than the raw input. Through time-consuming manual interventions, a user can correct wrongly predicted concept values to enhance the model's downstream performance. We propose Stochastic Concept Bottleneck Models (SCBMs), a novel approach that models concept dependencies. In SCBMs, a single-concept intervention affects all correlated concepts, thereby improving intervention effectiveness. Unlike previous approaches that model the concept relations via an autoregressive structure, we introduce an explicit, distributional parameterization that allows SCBMs to retain the CBMs' efficient training and inference procedure. Additionally, we leverage the parameterization to derive an effective intervention strategy based on the confidence region. We show empirically on synthetic tabular and natural image datasets that our approach improves intervention effectiveness significantly.

accuracy, concept dependency, intervention, (13 more...)

2406.19272

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > France (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

Ngnawé, Jonas, Sahoo, Sabyasachi, Pequignot, Yann, Precioso, Frédéric, Gagné, Christian

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

arXiv.org Artificial IntelligenceJun-26-2024

Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is computationally too intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score to detect non-robust samples and is intractable for deep neural networks. This paper introduces the concept of margin consistency -- a property that links the input space margins and the logit margins in robust models -- for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition to use a model's logit margin as a score for identifying non-robust samples. Next, through comprehensive empirical analysis of various robustly trained models on CIFAR10 and CIFAR100 datasets, we show that they indicate strong margin consistency with a strong correlation between their input space margins and the logit margins. Then, we show that we can effectively use the logit margin to confidently detect brittle decisions with such models and accurately estimate robust accuracy on an arbitrarily large test set by estimating the input margins only on a small subset. Finally, we address cases where the model is not sufficiently margin-consistent by learning a pseudo-margin from the feature representation. Our findings highlight the potential of leveraging deep representations to efficiently assess adversarial vulnerability in deployment scenarios.

logit margin, margin consistency, robustness, (14 more...)

2406.18451

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > Quebec (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Artificial IntelligenceJun-26-2024

Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble

Luo, Fan-Ming, Cao, Xingchen, Qin, Rong-Jun, Yu, Yang

Recovering reward function from expert demonstrations is a fundamental problem in reinforcement learning. The recovered reward function captures the motivation of the expert. Agents can imitate experts by following these reward functions in their environment, which is known as apprentice learning. However, the agents may face environments different from the demonstrations, and therefore, desire transferable reward functions. Classical reward learning methods such as inverse reinforcement learning (IRL) or, equivalently, adversarial imitation learning (AIL), recover reward functions coupled with training dynamics, which are hard to be transferable. Previous dynamics-agnostic reward learning methods rely on assumptions such as that the reward function has to be state-only, restricting their applicability. In this work, we present a dynamics-agnostic discriminator-ensemble reward learning method (DARL) within the AIL framework, capable of learning both state-action and state-only reward functions. DARL achieves this by decoupling the reward function from training dynamics, employing a dynamics-agnostic discriminator on a latent space derived from the original state-action space. This latent space is optimized to minimize information on the dynamics. We moreover discover the policy-dependency issue of the AIL framework that reduces the transferability. DARL represents the reward function as an ensemble of discriminators during training to eliminate policy dependencies. Empirical studies on MuJoCo tasks with changed dynamics show that DARL better recovers the reward function and results in better imitation performance in transferred environments, handling both state-only and state-action reward scenarios.

discriminator, learning, reward function, (13 more...)

2206.00238

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(24 more...)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)