Regularized Q-Learning
Q-learning is a widely used algorithm in the reinforcement learning (RL) community. In the lookup-table setting, its convergence is well established. However, its behavior is known to be unstable when combined with linear function approximation. This paper develops a new Q-learning algorithm, called RegQ, that converges when linear function approximation is used. We prove that simply adding an appropriate regularization term ensures convergence of the algorithm. Its stability is established using a recent analysis tool based on switching system models. Moreover, we experimentally show that RegQ converges in environments where Q-learning with linear function approximation was known to diverge. An error bound on the solution to which the algorithm converges is also given.
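To make the setting concrete, the update below is a minimal sketch of semi-gradient Q-learning with linear function approximation plus an L2-style regularization term. The function name, the quadratic regularizer, and the parameter `eta` are illustrative assumptions here; the abstract only states that an appropriate regularization term is added, and the paper's exact form may differ.

```python
import numpy as np

def regq_update(theta, phi_sa, phi_next_all, r, gamma, alpha, eta):
    """One regularized semi-gradient Q-learning step (illustrative sketch).

    theta: weight vector of the linear Q-function, Q(s, a) = theta @ phi(s, a)
    phi_sa: feature vector of the current state-action pair
    phi_next_all: feature vectors of every action at the next state
    eta: regularization strength; eta = 0 recovers plain Q-learning,
         which may diverge under linear function approximation.
    """
    q_sa = theta @ phi_sa
    q_next = max(theta @ phi for phi in phi_next_all)
    td_error = r + gamma * q_next - q_sa
    # The -eta * theta term is the added regularization that the paper
    # argues stabilizes the iteration (here an L2 penalty for illustration).
    return theta + alpha * (td_error * phi_sa - eta * theta)
```

With `eta = 0` this is the standard update whose instability motivates the paper; the regularized version shrinks the weights toward zero at every step.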
Fast yet Safe: Early-Exiting with Risk Control
Alexander Timans, Tin Hadži Veljković
Scaling machine learning models significantly improves their performance. However, such gains come at the cost of inference being slow and resource-intensive. Early-exit neural networks (EENNs) offer a promising solution: they accelerate inference by allowing intermediate layers to 'exit' and produce a prediction early. Yet a fundamental issue with EENNs is how to determine when to exit without severely degrading performance. In other words, when is it 'safe' for an EENN to go 'fast'? To address this issue, we investigate how to adapt frameworks of risk control to EENNs. Risk control offers a distribution-free, post-hoc solution that tunes the EENN's exiting mechanism so that exits only occur when the output is of sufficient quality. We empirically validate our insights on a range of vision and language tasks, demonstrating that risk control can produce substantial computational savings, all the while preserving user-specified performance goals.
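As an illustration of the tuning step the abstract describes: an EENN typically exits when an intermediate layer's confidence clears a threshold, and risk control selects that threshold from held-out calibration data. The sketch below is a simplified stand-in under assumed names (`calibrate_threshold`, per-sample exit losses); actual risk-control procedures add finite-sample corrections (e.g. concentration bounds) that are omitted here.

```python
def calibrate_threshold(confidences, losses, target_risk, grid):
    """Pick the lowest exit threshold whose empirical risk stays under target.

    confidences: per-sample confidence of the early-exit head
    losses: loss incurred on each sample *if* it exits early
    grid: candidate thresholds, in ascending order (lower = more exits)

    Lower thresholds save more compute; this scans from the most
    aggressive threshold upward and keeps the first one that is 'safe'.
    """
    for lam in grid:
        exited = [l for c, l in zip(confidences, losses) if c >= lam]
        risk = sum(exited) / max(len(exited), 1)  # empirical risk of exits
        if risk <= target_risk:
            return lam
    return max(grid)  # fall back to the strictest threshold
```

The returned threshold is then fixed post hoc, so the deployed EENN only exits on samples it is confident enough about to keep the calibrated risk below the user-specified level.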
Geometry-aware training of factorized layers in tensor Tucker format
Reducing parameter redundancies in neural network architectures is crucial for achieving feasible computational and memory requirements during the training and inference phases. Given its easy implementation and flexibility, one promising approach is layer factorization, which reshapes weight tensors into a matrix format and parameterizes them as the product of two smaller low-rank matrices. However, this approach typically requires an initial full-model warm-up phase and prior knowledge of a feasible rank, and it is sensitive to parameter initialization. In this work, we introduce a novel approach to train the factors of a Tucker decomposition of the weight tensors. Our training proposal proves to be optimal in locally approximating the original unfactorized dynamics, independently of the initialization. Furthermore, the rank of each mode is dynamically updated during training. We provide a theoretical analysis of the algorithm, showing convergence, approximation, and local descent guarantees. The method's performance is further illustrated through a variety of experiments, showing remarkable training compression rates and comparable or even better performance than the full baseline and alternative layer factorization strategies.
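For intuition on the matrix-format baseline the abstract contrasts with: a dense weight matrix can be replaced by a rank-r product of two factors, which is what the paper generalizes to the tensor Tucker setting. The sketch below uses a truncated SVD (the best rank-r approximation by Eckart-Young) purely as an illustration; the function name and rank choice are assumptions, and the paper's method trains the factors directly rather than factorizing a trained matrix.

```python
import numpy as np

def factorize(W, rank):
    """Replace a dense weight matrix W by a rank-`rank` product U @ V.

    Storage drops from m*n entries to rank*(m + n), the parameter
    saving that motivates factorized layers.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]  # absorb singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r
```

A fixed rank chosen up front is exactly the "prior knowledge of a feasible rank" the abstract flags as a limitation; the proposed method instead adapts the rank of each Tucker mode during training.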
TransBoost: Improving the Best ImageNet Performance using Deep Transduction
Supplementary Material
Department of Computer Science, Technion - Israel Institute of Technology
omer.be@cs.technion.ac.il, guy.b@cs.technion.ac.il
In general, TransBoost is particularly useful when we are able to accumulate a test set of instances and then finetune a specialized model to predict their labels. This setting has numerous use cases in various application fields, including:
Medicine: Medical diagnosis is one meaningful use case. Here, medical records can be gathered on a daily or weekly basis. TransBoost can then be used to finetune transductive models on top of existing inductive models in order to provide more reliable results for these specific records.