AITopics | taxnodes:Technology: Overviews

Plotting

taxnodes:Technology: Overviews

News Overviews Instructional Materials AI-Alerts Classics

A Additional Related Works We review the recent studies in OOD detection, model reprogramming, and backdoor attack

Neural Information Processing SystemsMar-23-2025, 06:57:54 GMT

We review the recent studies in OOD detection, model reprogramming, and backdoor attack. A.1 OOD Detection Following [58], we attribute existing works into three categories, namely, the classification-based methods, the density-based methods, and the distance-based methods. In general, these methods aim to maximize the gap between ID and OOD data regarding specified metrics in identifying OOD data. The classification-based methods use the representations extracted from the well-trained classification models in OOD scoring. For example, [17, 32, 33, 37, 46, 51] employ logit outputs from models in estimating the confidence of ID data; [28, 44] adopt Mahalanobis distance and Gram Matrix to exploit models' detection capability from embedding features; [21, 32] further demonstrate the importance of gradient information, either perturbing inputs with its gradients or directly using the gradient norm in scoring.

artificial intelligence, free energy, machine learning, (14 more...)

Neural Information Processing Systems

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.75)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

63b2b056f48653b7cff0d8d233c96a4d-Paper-Conference.pdf

Neural Information Processing SystemsMar-23-2025, 06:16:07 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Overview (0.67)
Research Report (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

fdda6e957f1e5ee2f3b311fe4f145ae1-Paper.pdf

Neural Information Processing SystemsMar-23-2025, 05:32:57 GMT

machine learning, natural language, variance, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Indiana > Tippecanoe County (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > New Finding (0.93)
Overview (0.68)
Research Report > Experimental Study (0.68)

Industry:

Law (1.00)
Information Technology (0.93)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(5 more...)

Add feedback

Parameter Prediction for Unseen Deep Architectures Boris Knyazev 1,2 Graham W. Taylor 1,2,3, Adriana Romero-Soriano

Neural Information Processing SystemsMar-23-2025, 02:19:50 GMT

Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms optimizing neural network parameters remain largely hand-designed and computationally inefficient. We study if we can use deep learning to directly predict these parameters by exploiting the past knowledge of training other networks.

artificial intelligence, machine learning, survey article, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.68)

Genre: Overview (0.46)

Industry: Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning Hussein Hazimeh 1

Neural Information Processing SystemsMar-23-2025, 02:08:20 GMT

The Mixture-of-Experts (MoE) architecture is showing promising results in improving parameter sharing in multi-task learning (MTL) and in scaling high-capacity neural networks. State-of-the-art MoE models use a trainable "sparse gate" to select a subset of the experts for each input example. While conceptually appealing, existing sparse gates, such as Top-k, are not smooth. The lack of smoothness can lead to convergence and statistical performance issues when training with gradientbased methods. In this paper, we develop DSelect-k: a continuously differentiable and sparse gate for MoE, based on a novel binary encoding formulation. The gate can be trained using first-order methods, such as stochastic gradient descent, and offers explicit control over the number of experts to select. We demonstrate the effectiveness of DSelect-k on both synthetic and real MTL datasets with up to 128 tasks. Our experiments indicate that DSelect-k can achieve statistically significant improvements in prediction and expert selection over popular MoE gates. Notably, on a real-world, large-scale recommender system, DSelect-k achieves over 22% improvement in predictive performance compared to Top-k.

artificial intelligence, dselect-k, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Hardness in Markov Decision Processes: Theory and Practice

Neural Information Processing SystemsMar-23-2025, 01:00:22 GMT

Meticulously analysing the empirical strengths and weaknesses of reinforcement learning methods in hard (challenging) environments is essential to inspire innovations and assess progress in the field. In tabular reinforcement learning, there is no well-established standard selection of environments to conduct such analysis, which is partially due to the lack of a widespread understanding of the rich theory of hardness of environments. The goal of this paper is to unlock the practical usefulness of this theory through four main contributions. First, we present a systematic survey of the theory of hardness, which also identifies promising research directions. Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis and implements a principled benchmark composed of environments that are diverse with respect to different measures of hardness. Third, we present an empirical analysis that provides new insights into computable measures. Finally, we benchmark five tabular agents in our newly proposed benchmark. While advancing the theoretical understanding of hardness in non-tabular reinforcement learning remains essential, our contributions in the tabular setting are intended as solid steps towards a principled non-tabular benchmark. Accordingly, we benchmark four agents in non-tabular versions of Colosseum environments, obtaining results that demonstrate the generality of tabular hardness measures.

machine learning, mg-empty, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre: Overview (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Add feedback

Model Compression with Adversarial Robustness: A Unified Optimization Framework

Shupeng Gui, Haotao N. Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu

Neural Information Processing SystemsMar-23-2025, 00:32:37 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, arxiv preprint arxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre:

Research Report (0.68)
Overview (0.46)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model

Erik Nijkamp, Mitch Hill, Song-Chun Zhu, Ying Nian Wu

Neural Information Processing SystemsMar-23-2025, 00:21:11 GMT

This paper studies a curious phenomenon in learning energy-based model (EBM) using MCMC. In each learning iteration, we generate synthesized examples by running a non-convergent, non-mixing, and non-persistent short-run MCMC toward the current model, always starting from the same initial distribution such as uniform noise distribution, and always running a fixed number of MCMC steps. After generating synthesized examples, we then update the model parameters according to the maximum likelihood learning gradient, as if the synthesized examples are fair samples from the current model. We treat this non-convergent short-run MCMC as a learned generator model or a flow model. We provide arguments for treating the learned non-convergent short-run MCMC as a valid model. We show that the learned short-run MCMC is capable of generating realistic images. More interestingly, unlike traditional EBM or MCMC, the learned short-run MCMC is capable of reconstructing observed images and interpolating between images, like generator or flow models. The code can be found in the Appendix.

artificial intelligence, machine learning, short-run mcmc, (14 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.14)

Genre:

Research Report (0.34)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Optimization Algorithm Design via Electric Circuits

Neural Information Processing SystemsMar-23-2025, 00:04:49 GMT

We present a novel methodology for convex optimization algorithm design using ideas from electric RLC circuits. Given an optimization problem, the first stage of the methodology is to design an appropriate electric circuit whose continuoustime dynamics converge to the solution of the optimization problem at hand. Then, the second stage is an automated, computer-assisted discretization of the continuous-time dynamics, yielding a provably convergent discrete-time algorithm. Our methodology recovers many classical (distributed) optimization algorithms and enables users to quickly design and explore a wide range of new algorithms with convergence guarantees.

algorithm, artificial intelligence, survey article, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.27)

Genre:

Research Report > Experimental Study (0.92)
Instructional Material > Course Syllabus & Notes (0.67)
Overview (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems

Lin, Zhidi, Li, Ying, Yin, Feng, Maroñas, Juan, Thiéry, Alexandre H.

arXiv.org Machine LearningMar-23-2025

Gaussian process state-space models (GPSSMs) have emerged as a powerful framework for modeling dynamical systems, offering interpretable uncertainty quantification and inherent regularization. However, existing GPSSMs face significant challenges in handling high-dimensional, non-stationary systems due to computational inefficiencies, limited scalability, and restrictive stationarity assumptions. In this paper, we propose an efficient transformed Gaussian process state-space model (ETGPSSM) to address these limitations. Our approach leverages a single shared Gaussian process (GP) combined with normalizing flows and Bayesian neural networks, enabling efficient modeling of complex, high-dimensional state transitions while preserving scalability. To address the lack of closed-form expressions for the implicit process in the transformed GP, we follow its generative process and introduce an efficient variational inference algorithm, aided by the ensemble Kalman filter (EnKF), to enable computationally tractable learning and inference. Extensive empirical evaluations on synthetic and real-world datasets demonstrate the superior performance of our ETGPSSM in system dynamics learning, high-dimensional state estimation, and time-series forecasting, outperforming existing GPSSMs and neural network-based methods in both accuracy and computational efficiency.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

arXiv.org Machine Learning

2503.18309

Country: Asia > China (0.28)

Genre: