AITopics

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)

arXiv.org Machine LearningJan-30-2026

Best Arm Identification with LLM Judges and Limited Human

Ao, Ruicheng, Chen, Hongyu, Gao, Siyang, Li, Hanwei, Simchi-Levi, David

We study fixed-confidence best-arm identification (BAI) where a cheap but potentially biased proxy (e.g., LLM judge) is available for every sample, while an expensive ground-truth label can only be acquired selectively when using a human for auditing. Unlike classical multi-fidelity BAI, the proxy is biased (arm- and context-dependent) and ground truth is selectively observed. Consequently, standard multi-fidelity methods can mis-select the best arm, and uniform auditing, though accurate, wastes scarce resources and is inefficient. We prove that without bias correction and propensity adjustment, mis-selection probability may not vanish (even with unlimited proxy data). We then develop an estimator for the mean of each arm that combines proxy scores with inverse-propensity-weighted residuals and form anytime-valid confidence sequences for that estimator. Based on the estimator and confidence sequence, we propose an algorithm that adaptively selects and audits arms. The algorithm concentrates audits on unreliable contexts and close arms and we prove that a plug-in Neyman rule achieves near-oracle audit efficiency. Numerical experiments confirm the theoretical guarantees and demonstrate the superior empirical performance of the proposed algorithm.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2601.21471

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)

Neural Information Processing SystemsDec-26-2025, 15:41:27 GMT

Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning

We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training. The intuition behind this is that even a small number of relevant experiences from other agents could help each agent learn. Unlike many other multi-agent RL algorithms, this approach allows for largely decentralized training, requiring only a limited communication channel between agents. We show that our approach outperforms baseline no-sharing decentralized training and state-of-the art multi-agent RL algorithms. Further, sharing only a small number of highly relevant experiences outperforms sharing all experiences between agents, and the performance uplift from selective experience sharing is robust across a range of hyperparameters and DQN variants.

multi-agent reinforcement learning, name change, selectively, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)

Neural Information Processing SystemsOct-3-2025, 09:56:35 GMT

particular, we clarify some potential misunderstandings from R# 3 and provide extra experiments as suggested by R#3

We thank all reviewers for their valuable and constructive comments. Below, we address the detailed comments. It is shown that PR can be extended to "selectively" incorporate uncertain We'll make this clearer in the final version. The odd columns are real data and even ones are the reconstruction results. It was a fault to miss the 8-th column (i.e., the reconstruction We'll fix these issues for better presentation.

constraint, knowledge, penalty term, (15 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)

Grinberg, Yuval, Harel, Nimrod, Goldberger, Jacob, Lindenbaum, Ofir

Detect and Correct: A Selective Noise Correction Method for Learning with Noisy Labels

arXiv.org Artificial IntelligenceJul-31-2025

Falsely annotated samples, also known as noisy labels, can significantly harm the performance of deep learning models. Two main approaches for learning with noisy labels are global noise estimation and data filtering. Global noise estimation approximates the noise across the entire dataset using a noise transition matrix, but it can unnecessarily adjust correct labels, leaving room for local improvements. Data filtering, on the other hand, discards potentially noisy samples but risks losing valuable data. Our method identifies potentially noisy samples based on their loss distribution. We then apply a selection process to separate noisy and clean samples and learn a noise transition matrix to correct the loss for noisy samples while leaving the clean data unaffected, thereby improving the training process. Our approach ensures robust learning and enhanced model performance by preserving valuable information from noisy samples and refining the correction process. We applied our method to standard image datasets (MNIST, CIFAR-10, and CIFAR-100) and a biological scRNA-seq cell-type annotation dataset. We observed a significant improvement in model accuracy and robustness compared to traditional methods.

artificial intelligence, machine learning, noisy label, (15 more...)

doi: 10.24132/CSRN.2025-19

2505.13342

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Majumder, Surajit, Ranjan, Paritosh, Roy, Prodip, Padhan, Bhuban

Switch-Based Multi-Part Neural Network

arXiv.org Artificial IntelligenceApr-28-2025

This paper introduces decentralized and modular neural network framework designed to enhance the scalability, interpretability, and performance of artificial intelligence (AI) systems. At the heart of this framework is a dynamic switch mechanism that governs the selective activation and training of individual neurons based on input characteristics, allowing neurons to specialize in distinct segments of the data domain. This approach enables neurons to learn from disjoint subsets of data, mimicking biological brain function by promoting task specialization and improving the interpretability of neural network behavior. Furthermore, the paper explores the application of federated learning and decentralized training for real-world AI deployments, particularly in edge computing and distributed environments. By simulating localized training on non-overlapping data subsets, we demonstrate how modular networks can be efficiently trained and evaluated. The proposed framework also addresses scalability, enabling AI systems to handle large datasets and distributed processing while preserving model transparency and interpretability. Finally, we discuss the potential of this approach in advancing the design of scalable, privacy-preserving, and efficient AI systems for diverse applications.

artificial intelligence, machine learning, neuron, (15 more...)

2504.18241

Genre: Research Report (0.66)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsJan-19-2025, 20:45:08 GMT

Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning

We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training. The intuition behind this is that even a small number of relevant experiences from other agents could help each agent learn. Unlike many other multi-agent RL algorithms, this approach allows for largely decentralized training, requiring only a limited communication channel between agents. We show that our approach outperforms baseline no-sharing decentralized training and state-of-the art multi-agent RL algorithms. Further, sharing only a small number of highly relevant experiences outperforms sharing all experiences between agents, and the performance uplift from selective experience sharing is robust across a range of hyperparameters and DQN variants.

artificial intelligence, machine learning, multi-agent reinforcement learning, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceDec-18-2024

Cherry-Picking in Time Series Forecasting: How to Select Datasets to Make Your Model Shine

Roque, Luis, Soares, Carlos, Cerqueira, Vitor, Torgo, Luis

The importance of time series forecasting drives continuous research and the development of new approaches to tackle this problem. Typically, these methods are introduced through empirical studies that frequently claim superior accuracy for the proposed approaches. Nevertheless, concerns are rising about the reliability and generalizability of these results due to limitations in experimental setups. This paper addresses a critical limitation: the number and representativeness of the datasets used. We investigate the impact of dataset selection bias, particularly the practice of cherry-picking datasets, on the performance evaluation of forecasting methods. Through empirical analysis with a diverse set of benchmark datasets, our findings reveal that cherry-picking datasets can significantly distort the perceived performance of methods, often exaggerating their effectiveness. Furthermore, our results demonstrate that by selectively choosing just four datasets - what most studies report - 46% of methods could be deemed best in class, and 77% could rank within the top three. Additionally, recent deep learning-based approaches show high sensitivity to dataset selection, whereas classical methods exhibit greater robustness. Finally, our results indicate that, when empirically validating forecasting algorithms on a subset of the benchmarks, increasing the number of datasets tested from 3 to 6 reduces the risk of incorrectly identifying an algorithm as the best one by approximately 40%. Our study highlights the critical need for comprehensive evaluation frameworks that more accurately reflect real-world scenarios. Adopting such frameworks will ensure the development of robust and reliable forecasting methods.

artificial intelligence, data mining, machine learning, (17 more...)

2412.14435

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.08)
Europe > Portugal > Porto > Porto (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-1-2024

Addressing a fundamental limitation in deep vision models: lack of spatial attention

Borji, Ali

The primary aim of this manuscript is to underscore a significant limitation in current deep learning models, particularly vision models. Unlike human vision, which efficiently selects only the essential visual areas for further processing, leading to high speed and low energy consumption, deep vision models process the entire image. In this work, we examine this issue from a broader perspective and propose a solution that could pave the way for the next generation of more efficient vision models. Basically, convolution and pooling operations are selectively applied to altered regions, with a change map sent to subsequent layers. This map indicates which computations need to be repeated. The code is available at https://github.com/aliborji/spatial_attention.

change map, computation, vision model, (14 more...)

2407.01782

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gerstgrasser, Matthias, Danino, Tom, Keren, Sarah

Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning

arXiv.org Artificial IntelligenceNov-1-2023

We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training. The intuition behind this is that even a small number of relevant experiences from other agents could help each agent learn. Unlike many other multi-agent RL algorithms, this approach allows for largely decentralized training, requiring only a limited communication channel between agents. We show that our approach outperforms baseline no-sharing decentralized training and state-of-the art multi-agent RL algorithms. Further, sharing only a small number of highly relevant experiences outperforms sharing all experiences between agents, and the performance uplift from selective experience sharing is robust across a range of hyperparameters and DQN variants. A reference implementation of our algorithm is available at https://github.com/mgerstgrasser/super.

multi-agent reinforcement learning, selectively

2311.00865

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)