AITopics

2407.07966

Country:

North America > United States > Texas > Denton County > Denton (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Europe > Ukraine (0.04)
(21 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Government > Military > Cyberwarfare (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(5 more...)

Schulte, Oliver, Poupart, Pascal

Why Online Reinforcement Learning is Causal

arXiv.org Artificial IntelligenceJul-10-2024

Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select interventions that maximize the rewards the agent receives from the environment. Reinforcement learning includes the two most powerful sources of information for estimating causal relationships: temporal ordering and the ability to act on an environment. This paper examines which reinforcement learning settings we can expect to benefit from causal modelling, and how. In online learning, the agent has the ability to interact directly with their environment, and learn from exploring it. Our main argument is that in online learning, conditional probabilities are causal, and therefore offline RL is the setting where causal learning has the most potential to make a difference. Essentially, the reason is that when an agent learns from their {\em own} experience, there are no unobserved confounders that influence both the agent's own exploratory actions and the rewards they receive. Our paper formalizes this argument. For offline RL, where an agent may and typically does learn from the experience of {\em others}, we describe previous and new methods for leveraging a causal model, including support for counterfactual queries.

agent, conditional probability, probability, (15 more...)

2403.04221

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material > Online (0.40)

Industry:

Education (0.68)
Leisure & Entertainment > Sports > Hockey (0.68)
Leisure & Entertainment > Games (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

Auras, Alexander, Gandikota, Kanchana Vaishnavi, Droege, Hannah, Moeller, Michael

Robustness and Exploration of Variational and Machine Learning Approaches to Inverse Problems: An Overview

This paper provides an overview of current approaches for solving inverse problems in imaging using variational methods and machine learning. A special focus lies on point estimators and their robustness against adversarial perturbations. In this context results of numerical experiments for a one-dimensional toy problem are provided, showing the robustness of different approaches and empirically verifying theoretical guarantees. Another focus of this review is the exploration of the subspace of data-consistent solutions through explicit guidance to satisfy specific semantic or textural properties.

artificial intelligence, inverse problem, machine learning, (17 more...)

2402.12072

Country:

Europe > Germany (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Factored Conditional Filtering: Tracking States and Estimating Parameters in High-Dimensional Spaces

Chen, Dawei, Yang-Zhao, Samuel, Lloyd, John, Ng, Kee Siong

This paper introduces factored conditional filters, new filtering algorithms for simultaneously tracking states and estimating parameters in high-dimensional state spaces. The conditional nature of the algorithms is used to estimate parameters and the factored nature is used to decompose the state space into low-dimensional subspaces in such a way that filtering on these subspaces gives distributions whose product is a good approximation to the distribution on the entire state space. The conditions for successful application of the algorithms are that observations be available at the subspace level and that the transition model can be factored into local transition models that are approximately confined to the subspaces; these conditions are widely satisfied in computer science, engineering, and geophysical filtering applications. We give experimental results on tracking epidemics and estimating parameters in large contact networks that show the effectiveness of our approach.

contact network, particle filter, transition model, (16 more...)

2206.02178

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Italy (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(4 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Robust Reinforcement Learning from Corrupted Human Feedback

Bukharin, Alexander, Hong, Ilgee, Jiang, Haoming, Li, Zichong, Zhang, Qingru, Zhang, Zixuan, Zhao, Tuo

Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data. For various reasons, e.g., personal bias, context ambiguity, lack of training, etc, human annotators may give incorrect or inconsistent preference labels. To tackle this challenge, we propose a robust RLHF approach -- $R^3M$, which models the potentially corrupted preference label as sparse outliers. Accordingly, we formulate the robust reward learning as an $\ell_1$-regularized maximum likelihood estimation problem. Computationally, we develop an efficient alternating optimization algorithm, which only incurs negligible computational overhead compared with the standard RLHF approach. Theoretically, we prove that under proper regularity conditions, $R^3M$ can consistently learn the underlying reward and identify outliers, provided that the number of outlier labels scales sublinearly with the preference sample size. Furthermore, we remark that $R^3M$ is versatile and can be extended to various preference optimization methods, including direct preference optimization (DPO). Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R^3M$ improves robustness of the reward against several types of perturbations to the preference data.

arxiv preprint arxiv, noise model, preference data, (15 more...)

2406.15568

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-Label Classification

Wang, Zitai, Xu, Qianqian, Yang, Zhiyong, Wen, Peisong, He, Yuan, Cao, Xiaochun, Huang, Qingming

Multi-label ranking, which returns multiple top-ranked labels for each instance, has a wide range of applications for visual tasks. Due to its complicated setting, prior arts have proposed various measures to evaluate model performances. However, both theoretical analysis and empirical observations show that a model might perform inconsistently on different measures. To bridge this gap, this paper proposes a novel measure named Top-K Pairwise Ranking (TKPR), and a series of analyses show that TKPR is compatible with existing ranking-based measures. In light of this, we further establish an empirical surrogate risk minimization framework for TKPR. On one hand, the proposed framework enjoys convex surrogate losses with the theoretical support of Fisher consistency. On the other hand, we establish a sharp generalization bound for the proposed framework based on a novel technique named data-dependent contraction. Finally, empirical results on benchmark datasets validate the effectiveness of the proposed framework.

generalization, ranking loss, tkpr, (15 more...)

doi: 10.1007/s11263-024-02157-w

2407.06709

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(4 more...)

Ibrahim, Shahana, Traganitis, Panagiotis A., Fu, Xiao, Giannakis, Georgios B.

Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective

One of the primary catalysts fueling advances in artificial intelligence (AI) and machine learning (ML) is the availability of massive, curated datasets. A commonly used technique to curate such massive datasets is crowdsourcing, where data are dispatched to multiple annotators. The annotator-produced labels are then fused to serve downstream learning and inference tasks. This annotation process often creates noisy labels due to various reasons, such as the limited expertise, or unreliability of annotators, among others. Therefore, a core objective in crowdsourcing is to develop methods that effectively mitigate the negative impact of such label noise on learning tasks. This feature article introduces advances in learning from noisy crowdsourced labels. The focus is on key crowdsourcing models and their methodological treatments, from classical statistical models to recent deep learning-based approaches, emphasizing analytical insights and algorithmic developments. In particular, this article reviews the connections between signal processing (SP) theory and methods, such as identifiability of tensor and nonnegative matrix factorization, and novel, principled solutions of longstanding challenges in crowdsourcing -- showing how SP perspectives drive the advancements of this field. Furthermore, this article touches upon emerging topics that are critical for developing cutting-edge AI/ML systems, such as crowdsourcing in reinforcement learning with human feedback (RLHF) and direct preference optimization (DPO) that are key techniques for fine-tuning large language models (LLMs).

algorithm, annotator, ds model, (13 more...)

2407.06902

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Florida > Orange County > Orlando (0.14)
North America > United States > Oregon > Benton County > Corvallis (0.04)
(3 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Riefolo, Christian, Fanizzi, Nicola, d'Amato, Claudia

Simple and Interpretable Probabilistic Classifiers for Knowledge Graphs

Tackling the problem of learning probabilistic classifiers from incomplete data in the context of Knowledge Graphs expressed in Description Logics, we describe an inductive approach based on learning simple belief networks. Specifically, we consider a basic probabilistic model, a Naive Bayes classifier, based on multivariate Bernoullis and its extension to a two-tier network in which this classification model is connected to a lower layer consisting of a mixture of Bernoullis. We show how such models can be converted into (probabilistic) axioms (or rules) thus ensuring more interpretability. Moreover they may be also initialized exploiting expert knowledge. We present and discuss the outcomes of an empirical evaluation which aimed at testing the effectiveness of the models on a number of random classification problems with different ontologies.

ber, experiment, simple and interpretable probabilistic classifier, (11 more...)

2407.07045

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning to Complement and to Defer to Multiple Users

Zhang, Zheng, Ai, Wenjie, Wells, Kevin, Rosewarne, David, Do, Thanh-Toan, Carneiro, Gustavo

With the development of Human-AI Collaboration in Classification (HAI-CC), integrating users and AI predictions becomes challenging due to the complex decision-making process. This process has three options: 1) AI autonomously classifies, 2) learning to complement, where AI collaborates with users, and 3) learning to defer, where AI defers to users. Despite their interconnected nature, these options have been studied in isolation rather than as components of a unified system. In this paper, we address this weakness with the novel HAI-CC methodology, called Learning to Complement and to Defer to Multiple Users (LECODU). LECODU not only combines learning to complement and learning to defer strategies, but it also incorporates an estimation of the optimal number of users to engage in the decision process. The training of LECODU maximises classification accuracy and minimises collaboration costs associated with user involvement. Comprehensive evaluations across real-world and synthesized datasets demonstrate LECODU's superior performance compared to state-of-the-art HAI-CC methods. Remarkably, even when relying on unreliable users with high rates of label noise, LECODU exhibits significant improvement over both human decision-makers alone and AI alone.

accuracy, learning, lecodu, (12 more...)

2407.07003

Country:

North America > United States (0.14)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Surrey (0.04)
Europe > Italy > Apulia > Bari (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
(2 more...)

Massoli, Fabio Valerio, Louizos, Christos, Behboodi, Arash

Variational Learning ISTA

arXiv.org Machine LearningJul-9-2024

Compressed sensing combines the power of convex optimization techniques with a sparsity-inducing prior on the signal space to solve an underdetermined system of equations. For many problems, the sparsifying dictionary is not directly given, nor its existence can be assumed. Besides, the sensing matrix can change across different scenarios. Addressing these issues requires solving a sparse representation learning problem, namely dictionary learning, taking into account the epistemic uncertainty of the learned dictionaries and, finally, jointly learning sparse representations and reconstructions under varying sensing matrix conditions. We address both concerns by proposing a variant of the LISTA architecture. First, we introduce Augmented Dictionary Learning ISTA (A-DLISTA), which incorporates an augmentation module to adapt parameters to the current measurement setup. Then, we propose to learn a distribution over dictionaries via a variational approach, dubbed Variational Learning ISTA (VLISTA). VLISTA exploits A-DLISTA as the likelihood model and approximates a posterior distribution over the dictionaries as part of an unfolded LISTA-based recovery algorithm. As a result, VLISTA provides a probabilistic way to jointly learn the dictionary distribution and the reconstruction algorithm with varying sensing matrices. We provide theoretical and experimental support for our architecture and show that our model learns calibrated uncertainties.

a-dlista, matrix, vlista, (16 more...)

arXiv.org Machine Learning

2407.06646

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)