Offline Meta Reinforcement Learning - Identifiability Challenges and Effective Data Collection Strategies
Consider the following instance of the Offline Meta Reinforcement Learning (OMRL) problem: given the complete training logs of N conventional RL agents, trained on N different tasks, design a meta-agent that can quickly maximize reward in a new, unseen task from the same task distribution. In particular, while each conventional RL agent explored and exploited its own different task, the meta-agent must identify regularities in the data that lead to effective exploration/exploitation in the unseen task. Here, we take a Bayesian RL (BRL) view, and seek to learn a Bayes-optimal policy from the offline data. Building on the recent VariBAD BRL approach, we develop an off-policy BRL method that learns to plan an exploration strategy based on an adaptive neural belief estimate. However, learning to infer such a belief from offline data brings a new identifiability issue we term MDP ambiguity. We characterize the problem, and suggest resolutions via data collection and modification procedures. Finally, we evaluate our framework on a diverse set of domains, including difficult sparse reward tasks, and demonstrate learning of effective exploration behavior that is qualitatively different from the exploration used by any RL agent in the data. Our code is available online at https://github.com/Rondorf/BOReL.
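The data-modification resolution to MDP ambiguity can be made concrete with a cross-task reward-relabelling step. The sketch below is a minimal illustration under the assumption that each task's reward function is known (as when tasks share dynamics and differ only in rewards); the function and variable names (relabel_cross_task, buffers, reward_fns) are ours for illustration, not taken from the BOReL codebase.

```python
def relabel_cross_task(buffers, reward_fns):
    """Cross-task reward relabelling (a sketch of one data-modification
    strategy for MDP ambiguity): transitions collected in task i are
    re-labelled with task j's reward function and added to task j's
    buffer, so each reward function is paired with many exploration
    distributions rather than only its own agent's.

    buffers:    dict task_id -> list of (s, a, r, s_next) tuples
    reward_fns: dict task_id -> callable (s, a, s_next) -> reward
                (assumes known reward functions, as when tasks differ
                only in their rewards)
    """
    relabelled = {task: list(buf) for task, buf in buffers.items()}
    for src_task, buf in buffers.items():
        for dst_task, reward_fn in reward_fns.items():
            if src_task == dst_task:
                continue
            for (s, a, _r, s_next) in buf:
                relabelled[dst_task].append((s, a, reward_fn(s, a, s_next), s_next))
    return relabelled
```

After relabelling, every reward function appears alongside trajectories from many different behavior policies, which removes the spurious correlation between exploration behavior and task identity that makes the belief unidentifiable.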
Supplementary Material to "Sufficient dimension reduction for classification using principal optimal transport direction"
Without loss of generality, to prove Theorem 1 it is sufficient to show that S(B) = S(Σ) holds. To verify S(B) = S(Σ), we need only show that the following two statements, (I) and (II), hold. We begin with Statement (I). [...] This completes the proof of Statement (I). We then turn to Statement (II). [...] This leads to a contradiction with (H.2), where the structure dimension is r, which completes the proof of Statement (II).
Appendix A Assessing Conditional Independence/Dependence in CIFAR-10H and ImageNet-16H Datasets
We investigate the degree to which our conditional independence assumption is satisfied empirically in the datasets used in the paper. Specifically, of interest is the conditional independence of m(x) and h(x) given y. Assessing this is not straightforward: m(x) is a K-dimensional real-valued vector, while h(x) and y each take one of K categorical values, with K = 10 for CIFAR-10H and K = 16 for ImageNet-16H. While statistical tests exist for assessing conditional independence between categorical random variables, the situation for real-valued variables is more delicate, with multiple options such as non-parametric tests involving different tradeoffs [Runge, 2018, Marx and Vreeken, 2019, Mukherjee et al., 2020, Berrett et al., 2020]. Given these issues, we investigate the degree of conditional dependence using two relatively simple approaches.
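As a rough illustration of what such a simple approach can look like, the sketch below runs a within-class permutation test correlating the model's confidence in the true class with an indicator of human correctness. This is our own illustrative construction, not the paper's exact procedure, and it assumes integer class labels 0..K-1.

```python
import numpy as np

def per_class_dependence(m_probs, h_labels, y_true, n_perm=1000, seed=0):
    """Crude per-class check of conditional independence of m(x) and h(x)
    given y: within each true class, correlate a scalar summary of m(x)
    (its confidence in the true class) with the indicator that the human
    was correct, and compare against a permutation null. The summary
    statistic and names are illustrative.

    m_probs:  (n, K) array of model class probabilities
    h_labels: (n,) array of human labels in {0, ..., K-1}
    y_true:   (n,) array of true labels in {0, ..., K-1}
    """
    rng = np.random.default_rng(seed)
    results = {}
    for c in np.unique(y_true):
        idx = y_true == c
        conf = m_probs[idx, c]                       # model confidence in true class
        human_correct = (h_labels[idx] == c).astype(float)
        if conf.std() == 0 or human_correct.std() == 0:
            continue                                 # correlation undefined
        obs = np.corrcoef(conf, human_correct)[0, 1]
        null = np.array([
            np.corrcoef(conf, rng.permutation(human_correct))[0, 1]
            for _ in range(n_perm)
        ])
        results[int(c)] = (obs, float((np.abs(null) >= abs(obs)).mean()))
    return results  # class -> (correlation, permutation p-value)
```

Under conditional independence, the per-class correlations should be close to zero, with permutation p-values spread roughly uniformly across classes.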
Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration
Gavin Kerrigan, Mark Steyvers (Department of Computer Science)
An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor the model is perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently achieve higher accuracies than either the model or the human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.
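Under the conditional-independence assumption discussed in Appendix A, a combination rule of this kind takes the form P(y | m, h) ∝ P(y | m) · P(h | y). A minimal numpy sketch of that rule is given below, with a Laplace-smoothed human confusion matrix that can be fit from a handful of labeled points; variable names are ours, and calibration of the model probabilities is assumed to happen upstream.

```python
import numpy as np

def estimate_confusion(h_labels, y_true, num_classes, alpha=1.0):
    """Row-normalized human confusion matrix P(h | y) with Laplace
    smoothing, so it can be fit from very few labeled datapoints."""
    counts = np.full((num_classes, num_classes), alpha)
    for h, y in zip(h_labels, y_true):
        counts[y, h] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def combine(model_probs, human_label, confusion, eps=1e-12):
    """Combine calibrated model probabilities with a human's hard label
    via Bayes' rule under conditional independence:
        P(y | m, h) ∝ P(y | m) * P(h | y).

    model_probs: (K,) calibrated model distribution over classes
    human_label: int in {0, ..., K-1}
    confusion:   (K, K) matrix with confusion[y, h] ≈ P(h | y)
    """
    posterior = model_probs * confusion[:, human_label]
    return posterior / (posterior.sum() + eps)
```

The confusion matrix carries the human's class-level strengths and weaknesses, so the combination can overrule a confident model on classes the human rarely confuses, and vice versa.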
Fair Canonical Correlation Analysis
Zhuoping Zhou
This paper investigates fairness and bias in Canonical Correlation Analysis (CCA), a widely used statistical technique for examining the relationship between two sets of variables. We present a framework that alleviates unfairness by minimizing the correlation disparity error associated with protected attributes. Our approach enables CCA to learn global projection matrices from all data points while ensuring that these matrices yield comparable correlation levels to group-specific projection matrices. Experimental evaluation on both synthetic and real-world datasets demonstrates the efficacy of our method in reducing correlation disparity error without compromising CCA accuracy.
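To make the notion of correlation disparity error concrete, here is a rough numpy sketch that fits standard CCA per protected group and compares the correlation each group attains under the global projection matrices with the correlation of its own group-specific solution. The construction, including the ridge regularization and the exact error definition, is an assumption for illustration, not the paper's implementation.

```python
import numpy as np

def cca_directions(X, Y, k, reg=1e-6):
    """Top-k canonical directions via whitening + SVD; X, Y pre-centered."""
    n = len(X)
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx)).T   # Wx.T @ Sxx @ Wx = I
    Wy = np.linalg.inv(np.linalg.cholesky(Syy)).T
    A, _, Bt = np.linalg.svd(Wx.T @ Sxy @ Wy)
    return Wx @ A[:, :k], Wy @ Bt.T[:, :k]

def group_correlation(X, Y, U, V):
    """Mean canonical correlation attained by projections (U, V) on a group."""
    Xp, Yp = X @ U, Y @ V
    return float(np.mean([np.corrcoef(Xp[:, j], Yp[:, j])[0, 1]
                          for j in range(U.shape[1])]))

def correlation_disparity_error(X, Y, g, U, V, k):
    """Illustrative disparity measure in the spirit of the abstract: for
    each protected group, the gap between the correlation of that group's
    own CCA solution and the correlation achieved by the global (U, V)."""
    errs = {}
    for grp in np.unique(g):
        Xg, Yg = X[g == grp], Y[g == grp]
        Xg, Yg = Xg - Xg.mean(0), Yg - Yg.mean(0)
        Ug, Vg = cca_directions(Xg, Yg, k)
        errs[grp] = (group_correlation(Xg, Yg, Ug, Vg)
                     - group_correlation(Xg, Yg, U, V))
    return errs
```

A fair solution in this sense keeps each group's gap small and comparable across groups, so that no protected group pays a disproportionate cost for sharing global projection matrices.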