AITopics | Overview

Reviews: Is Q-Learning Provably Efficient?

Neural Information Processing Systems, May-26-2025, 10:08:51 GMT

This paper studies the problem of efficient exploration in finite episodic MDPs. The authors present a variant of Q-learning with optimistic initialization and tuned learning rates that recovers a UCB-style algorithm. The main contribution of this work is a polynomial regret bound for perhaps one of the most iconic "model-free" algorithms. There are several things to like about this paper:

- Q-learning is perhaps the classic introductory RL algorithm, so it's nice to see that we can recover sample-efficient guarantees for a variant of it. The computational time is also particularly appealing compared to existing model-free algorithms with sqrt{T} *expected* (Bayesian) regret (such as RLSVI), which have much higher computational and memory requirements.
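
The review does not spell out the update rule, so the following is only a minimal sketch of what a tabular Q-learning variant with optimistic initialization, a count-based learning rate, and a UCB-style bonus could look like. The learning rate `(H + 1) / (H + t)`, the Hoeffding-style bonus, the constants `c` and `p`, the fixed start state, and the `step` callback with its toy environment are all assumptions made for illustration, not details taken from the text above.

```python
import math
import random


def ucb_q_learning(step, H, S, A, K, c=1.0, p=0.05, seed=0):
    """Tabular episodic Q-learning with optimistic init and a UCB-style bonus.

    step(h, s, a, rng) -> (reward in [0, 1], next_state) is a hypothetical
    environment callback; c and p are illustrative constants.
    """
    rng = random.Random(seed)
    iota = math.log(S * A * H * K / p)  # log factor used inside the bonus
    # Optimistic initialization: every Q-value starts at the maximum return H.
    Q = [[[float(H)] * A for _ in range(S)] for _ in range(H)]
    N = [[[0] * A for _ in range(S)] for _ in range(H)]
    V = [[float(H)] * S for _ in range(H)] + [[0.0] * S]  # V[H] is zero
    total = 0.0
    for _ in range(K):
        s = 0  # assumed fixed start state
        for h in range(H):
            a = max(range(A), key=lambda x: Q[h][s][x])  # greedy in optimistic Q
            r, s_next = step(h, s, a, rng)
            total += r
            N[h][s][a] += 1
            t = N[h][s][a]
            alpha = (H + 1) / (H + t)                 # assumed tuned learning rate
            bonus = c * math.sqrt(H ** 3 * iota / t)  # assumed exploration bonus
            target = r + V[h + 1][s_next] + bonus
            Q[h][s][a] = (1 - alpha) * Q[h][s][a] + alpha * target
            V[h][s] = min(float(H), max(Q[h][s]))     # clip at the max return
            s = s_next
    return Q, V, total


def toy_step(h, s, a, rng):
    # Hypothetical 2-state environment: action 1 pays reward 1 and moves to state 1.
    return (1.0, 1) if a == 1 else (0.0, 0)


Q, V, total = ucb_q_learning(toy_step, H=3, S=2, A=2, K=50)
```

The sketch stores only the Q-table, the visit counts, and the value estimates, which is the kind of memory footprint the review contrasts with heavier alternatives.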

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Genre:

Research Report (0.50)
Overview (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reinforcement Learning for Solving the Vehicle Routing Problem

MohammadReza Nazari, Afshin Oroojlooy, Lawrence Snyder, Martin Takac

Neural Information Processing Systems, May-26-2025, 08:42:53 GMT

We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning. In this approach, we train a single policy model that finds near-optimal solutions for a broad range of problem instances of similar size, only by observing the reward signals and following feasibility rules. We consider a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance. On capacitated VRP, our approach outperforms classical heuristics and Google's OR-Tools on medium-sized instances in solution quality with comparable computation time (after training). We demonstrate how our approach can handle problems with split delivery and explore the effect of such deliveries on the solution quality. Our proposed framework can be applied to other variants of the VRP such as the stochastic VRP, and has the potential to be applied more generally to combinatorial optimization problems.
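
As a rough illustration of the idea in this abstract (a stochastic policy that respects feasibility rules and is trained from reward alone), here is a small REINFORCE sketch for capacitated routing. It is not the authors' model: the two-weight linear scoring policy, the feature choice, the moving-average baseline, and every function and parameter name are assumptions made to keep the example short. It also assumes every customer demand fits within the vehicle capacity.

```python
import math
import random


def rollout(coords, demands, capacity, w, rng):
    """Sample one capacitated route (node 0 is the depot).

    Returns (route, length, grad), where grad is the gradient of the
    log-probability of the sampled route with respect to the weights w.
    Assumes every demand is at most `capacity`.
    """
    unvisited = set(range(1, len(coords)))
    cur, load = 0, capacity
    route, length, grad = [0], 0.0, [0.0, 0.0]
    while unvisited:
        # Feasibility rule: mask customers whose demand exceeds the remaining load.
        feasible = [j for j in sorted(unvisited) if demands[j] <= load]
        if not feasible:
            # Forced move: return to the depot and refill.
            length += math.dist(coords[cur], coords[0])
            cur, load = 0, capacity
            route.append(0)
            continue
        feats = [(-math.dist(coords[cur], coords[j]), demands[j] / capacity)
                 for j in feasible]
        logits = [w[0] * f[0] + w[1] * f[1] for f in feats]
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        probs = [e / z for e in exps]
        k = rng.choices(range(len(feasible)), weights=probs)[0]
        # Softmax score function: chosen features minus expected features.
        for i in range(2):
            grad[i] += feats[k][i] - sum(q * f[i] for q, f in zip(probs, feats))
        j = feasible[k]
        length += math.dist(coords[cur], coords[j])
        cur, load = j, load - demands[j]
        unvisited.discard(j)
        route.append(j)
    length += math.dist(coords[cur], coords[0])
    route.append(0)
    return route, length, grad


def train(instances, capacity, steps=200, lr=0.05, seed=0):
    """REINFORCE with a moving-average baseline; reward is negative tour length."""
    rng = random.Random(seed)
    w, baseline = [0.0, 0.0], None
    for _ in range(steps):
        coords, demands = rng.choice(instances)
        _, length, grad = rollout(coords, demands, capacity, w, rng)
        reward = -length
        baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
        advantage = reward - baseline
        w = [wi + lr * advantage * gi for wi, gi in zip(w, grad)]
    return w
```

Because the weights score features rather than specific nodes, the same `w` can be reused on a new instance of similar size by calling `rollout` directly, which mirrors the abstract's point about not re-training per instance.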

machine learning, natural language, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.14)
North America > Canada (0.14)

Genre:

Research Report (0.46)
Overview (0.34)

Industry: Transportation > Freight & Logistics Services (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Advancing Video Anomaly Detection: A Concise Review and a New Dataset

Arjun Raj

Neural Information Processing Systems, May-26-2025, 03:22:49 GMT

Video Anomaly Detection (VAD) finds widespread applications in security surveillance, traffic monitoring, industrial monitoring, and healthcare. Despite extensive research efforts, there remains a lack of concise reviews that provide insightful guidance for researchers. Such reviews would serve as quick references to grasp current challenges, research trends, and future directions.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.87)
Education (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences

Richard Michael, Simon Bartels (University of Copenhagen)

Neural Information Processing Systems, May-25-2025, 21:38:23 GMT

Optimizing discrete black box functions is key in several domains, e.g. protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these tasks. Several methods for high-dimensional continuous and categorical Bayesian optimization have been proposed recently. However, our survey of the field reveals highly heterogeneous experimental set-ups across methods and technical barriers for the replicability and application of published algorithms to real-world tasks. To address these issues, we develop a unified framework to test a vast array of high-dimensional Bayesian optimization methods and a collection of standardized black box functions representing real-world application domains in chemistry and biology. These two components of the benchmark are each supported by flexible, scalable, and easily extendable software libraries (poli and poli-baselines), allowing practitioners to readily incorporate new optimization objectives or discrete optimizers.

machine learning, natural language, optimization, (14 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.76)
North America > United States (0.67)

Genre: Overview (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

OT4P: Unlocking Effective Orthogonal Group Path for Permutation Relaxation

Neural Information Processing Systems, May-25-2025, 18:36:30 GMT

Optimization over permutations is typically an NP-hard problem that arises extensively in ranking, matching, tracking, etc. Birkhoff polytope-based relaxation methods have made significant advancements, particularly in penalty-free optimization and probabilistic inference. Relaxation onto the orthogonal group offers unique potential advantages such as a lower representation dimension and preservation of inner products; however, equally effective approaches remain unexplored. To bridge the gap, we present a temperature-controlled differentiable transformation that maps unconstrained vector space to the orthogonal group, where the temperature, in the limit, concentrates orthogonal matrices near permutation matrices. This transformation naturally implements a parameterization for the relaxation of permutation matrices, allowing for gradient-based optimization of problems involving permutations. Additionally, by deriving a re-parameterized gradient estimator, this transformation also provides efficient stochastic optimization over the latent permutations. Extensive experiments involving the optimization over permutation matrices validate the effectiveness of the proposed method.

artificial intelligence, machine learning, survey article, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.28)
Asia > Middle East > Jordan (0.14)

Genre:

Research Report > Experimental Study (0.93)
Overview (0.93)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

fc034d186280f55370b6aca7a3285a65-Paper-Conference.pdf

Neural Information Processing Systems, May-25-2025, 17:31:32 GMT

artificial intelligence, machine learning, survey article, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre:

Research Report (0.66)
Overview (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

D4Explainer: In-Distribution GNN Explanations via Discrete Denoising Diffusion

Neural Information Processing Systems, May-25-2025, 17:12:30 GMT

The widespread deployment of Graph Neural Networks (GNNs) sparks significant interest in their explainability, which plays a vital role in model auditing and ensuring trustworthy graph learning. The objective of GNN explainability is to discern the underlying graph structures that have the most significant impact on model predictions. Ensuring that the generated explanations are reliable requires considering the in-distribution property, particularly because GNNs are vulnerable to out-of-distribution data. Unfortunately, prevailing explainability methods tend to constrain the generated explanations to the structure of the original graph, thereby downplaying the significance of the in-distribution property and resulting in explanations that lack reliability. To address these challenges, we propose D4Explainer, a novel approach that provides in-distribution GNN explanations for both counterfactual and model-level explanation scenarios. The proposed D4Explainer incorporates generative graph distribution learning into the optimization objective, which accomplishes two goals: 1) generate a collection of diverse counterfactual graphs that conform to the in-distribution property for a given instance, and 2) identify the most discriminative graph patterns that contribute to a specific class prediction, thus serving as model-level explanations. It is worth mentioning that D4Explainer is the first unified framework that combines both counterfactual and model-level explanations. Empirical evaluations conducted on synthetic and real-world datasets provide compelling evidence of the state-of-the-art performance achieved by D4Explainer in terms of explanation accuracy, faithfulness, diversity, and robustness.

data mining, explanation, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Genre:

Research Report > Promising Solution (0.34)
Overview (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Anytime-Competitive Reinforcement Learning with Policy Prior

Neural Information Processing Systems, May-25-2025, 16:46:57 GMT

This paper studies the problem of Anytime-Competitive Markov Decision Process (A-CMDP). Existing works on Constrained Markov Decision Processes (CMDPs) aim to optimize the expected reward while constraining the expected cost over random dynamics, but the cost in a specific episode can still be unsatisfactorily high. In contrast, the goal of A-CMDP is to optimize the expected reward while guaranteeing a bounded cost in each round of any episode against a policy prior. We propose a new algorithm, called Anytime-Competitive Reinforcement Learning (ACRL), which provably guarantees the anytime cost constraints. The regret analysis shows the policy asymptotically matches the optimal reward achievable under the anytime competitive constraints. Experiments on the application of carbon-intelligent computing verify the reward performance and cost constraint guarantee of ACRL.

constraint, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre:

Research Report (0.66)
Overview (0.48)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing

Sos S. Agaian

Neural Information Processing Systems, May-25-2025, 13:57:43 GMT

Remote photoplethysmography (rPPG) enables non-invasive extraction of blood volume pulse signals through imaging, transforming spatial-temporal data into time series signals. Advances in end-to-end rPPG approaches have focused on this transformation where attention mechanisms are crucial for feature extraction. However, existing methods compute attention disjointly across spatial, temporal, and channel dimensions. Here, we propose the Factorized Self-Attention Module (FSAM), which jointly computes multidimensional attention from voxel embeddings using nonnegative matrix factorization. To demonstrate FSAM's effectiveness, we developed FactorizePhys, an end-to-end 3D-CNN architecture for estimating blood volume pulse signals from raw video frames.
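
To make the idea of attention derived from nonnegative matrix factorization more concrete, here is a small sketch, not the FSAM implementation. It assumes the voxel embeddings have already been flattened into a nonnegative matrix `V` (rows as spatio-temporal positions, columns as channels), uses a rank-1 factorization, and gates `V` by the normalized low-rank reconstruction. The rank, the matrix layout, the gating step, and the function names are all assumptions for illustration.

```python
def rank1_nmf(V, iters=20, eps=1e-12):
    """Approximate a nonnegative matrix V (list of rows) as outer(w, h).

    For rank 1, the multiplicative NMF updates reduce to the alternating
    closed-form steps below, which keep w and h nonnegative when V is.
    """
    rows, cols = len(V), len(V[0])
    w = [1.0] * rows
    h = [1.0] * cols
    for _ in range(iters):
        ww = sum(x * x for x in w) + eps
        h = [sum(w[i] * V[i][j] for i in range(rows)) / ww for j in range(cols)]
        hh = sum(x * x for x in h) + eps
        w = [sum(V[i][j] * h[j] for j in range(cols)) / hh for i in range(rows)]
    return w, h


def factorized_attention(V):
    """Gate V element-wise by its rank-1 reconstruction, scaled to [0, 1]."""
    w, h = rank1_nmf(V)
    approx = [[wi * hj for hj in h] for wi in w]
    peak = max(max(row) for row in approx) or 1.0  # avoid dividing by zero
    return [[V[i][j] * approx[i][j] / peak for j in range(len(h))]
            for i in range(len(w))]
```

Because a single factorization spans all rows and columns at once, the resulting weights depend jointly on position and channel, in contrast to computing separate attention maps per dimension.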

artificial intelligence, factorizephy, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom (0.28)
North America > United States > New York (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.49)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Health & Medicine > Therapeutic Area > Hematology (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.67)

EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient Federated Learning

Neural Information Processing Systems, May-25-2025, 11:11:59 GMT

Federated Learning (FL) is a decentralized machine learning paradigm that enables collaborative model training across dispersed nodes without forcing individual nodes to share data. However, its broad adoption is hindered by the high communication costs of transmitting a large number of model parameters. This paper presents EvoFed, a novel approach that integrates Evolutionary Strategies (ES) with FL to address these challenges. EvoFed employs a concept of 'fitness-based information sharing', deviating significantly from the conventional model-based FL. Rather than exchanging the actual updated model parameters, each node transmits a distance-based similarity measure between the locally updated model and each member of the noise-perturbed model population.
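
The fitness-based exchange described above can be sketched roughly as follows. This is an illustrative reading of the abstract rather than the EvoFed implementation: the shared random seed for rebuilding the population, the negative squared L2 distance as the similarity measure, the mean-centered ES step on the server, and all function names are assumptions.

```python
import random


def population(theta, sigma, size, seed):
    """Noise-perturbed copies of theta; a shared seed lets every party rebuild them."""
    rng = random.Random(seed)
    noise = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(size)]
    members = [[t + sigma * e for t, e in zip(theta, eps)] for eps in noise]
    return noise, members


def node_fitness(local_theta, members):
    """Negative squared L2 distance from the locally updated model to each member."""
    return [-sum((a - b) ** 2 for a, b in zip(local_theta, m)) for m in members]


def server_update(theta, fitness_per_node, sigma, lr, seed):
    """Average the nodes' fitness vectors, then take an ES-style step (assumed)."""
    size = len(fitness_per_node[0])
    noise, _ = population(theta, sigma, size, seed)
    avg = [sum(f[i] for f in fitness_per_node) / len(fitness_per_node)
           for i in range(size)]
    mean = sum(avg) / size  # centering the fitness reduces estimator variance
    step = [sum((avg[i] - mean) * noise[i][d] for i in range(size)) / (size * sigma)
            for d in range(len(theta))]
    return [t + lr * s for t, s in zip(theta, step)]
```

In this sketch a node sends one number per population member instead of one number per model parameter, so the message shrinks whenever the population is smaller than the model.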

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

Neural Information Processing Systems

Country: