Dynamic Local Regret for Non-convex Online Forecasting

Neural Information Processing Systems

We consider online forecasting problems for non-convex machine learning models. Forecasting introduces several challenges: (i) frequent updates are necessary to cope with concept drift, since the dynamics of the environment change over time, and (ii) state-of-the-art forecasting models are non-convex. We address these challenges with a novel regret framework; standard regret measures typically account for neither a dynamic environment nor non-convex models. We introduce a local regret for non-convex models in a dynamic environment and present an update rule, based on time-smoothed gradients, whose cost under the proposed local regret is sublinear in the time horizon T. Using a real-world dataset, we show that our time-smoothed approach yields several benefits over state-of-the-art competitors: results are more stable against new data, training is more robust to hyperparameter selection, and our approach is more computationally efficient than the alternatives.
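The update rule averages gradients from a window of recent losses instead of using only the most recent one. Below is a minimal sketch of a time-smoothed online gradient step under assumed choices of window length, exponential weighting, and a toy drifting quadratic loss; it illustrates the idea rather than the authors' exact algorithm.

    import numpy as np

    def time_smoothed_step(theta, recent_grads, lr=0.01, alpha=0.9):
        # Exponentially weight the stored gradients (newest last) and take
        # one descent step along the weighted average.
        w = len(recent_grads)
        weights = np.array([alpha ** (w - 1 - i) for i in range(w)])
        weights /= weights.sum()
        smoothed = sum(wi * g for wi, g in zip(weights, recent_grads))
        return theta - lr * smoothed

    # Toy usage: online quadratic losses f_t(theta) = 0.5 * ||theta - c_t||^2
    # with a slowly drifting target c_t; window length 20 is an arbitrary choice.
    rng = np.random.default_rng(0)
    theta, grads = np.zeros(3), []
    for t in range(100):
        c_t = rng.normal(size=3) + 0.01 * t      # drifting target
        grads.append(theta - c_t)                # gradient of the current loss
        grads = grads[-20:]                      # keep a sliding window
        theta = time_smoothed_step(theta, grads)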


50a074e6a8da4662ae0a29edde722179-AuthorFeedback.pdf

Neural Information Processing Systems

REVIEWER 2: Thank you for your encouraging comments.
REVIEWER 3: Thank you for your comments.
REVIEWER 4: Thank you for your comments. "Without some formal notion or even toy scenario for concept drift, it's not clear what theoretical basis there is to prefer ..." Call this the oracle policy. Call this the stale policy.


APPENDIX

Neural Information Processing Systems

Universal approximation for densities is a property often discussed in the context of autoregressive normalizing flows. It can be shown, based on the proof of existence and non-uniqueness of solutions to the nonlinear ICA problem [29], that any distribution can be mapped onto a factorized base distribution by an invertible function with triangular Jacobian, provided that the function class used for this mapping is large enough. Normalizing flows with triangular Jacobians and a high number of parameters therefore have this approximation capacity (see, e.g., ...).
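One reason triangular Jacobians are attractive computationally: the log-determinant in the change-of-variables formula reduces to a sum of logs of the diagonal entries. The snippet below evaluates the exact log-density under a toy elementwise (hence trivially triangular) transform with a standard normal base distribution; the transform itself is a hypothetical illustration, not a trained flow.

    import numpy as np

    def toy_flow_log_density(x, scales):
        # z = scales * x has a diagonal, hence triangular, Jacobian, so
        # log|det J| = sum_i log|scales_i| and the change-of-variables
        # log-density is cheap to evaluate exactly.
        z = scales * x
        log_det = np.sum(np.log(np.abs(scales)))
        log_base = -0.5 * np.sum(z ** 2) - 0.5 * x.size * np.log(2 * np.pi)
        return log_base + log_det

    print(toy_flow_log_density(np.array([0.3, -1.2, 0.7]),
                               scales=np.array([2.0, 0.5, 1.5])))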


Relative gradient optimization of the Jacobian term in unsupervised deep learning
Luigi Gresele

Neural Information Processing Systems

Learning expressive probabilistic models that correctly describe the data is a ubiquitous problem in machine learning. A popular approach is to map the observations into a representation space with a simple joint distribution, which can typically be written as a product of its marginals -- thus drawing a connection with the field of nonlinear independent component analysis. Deep density models have been widely used for this task, but their maximum-likelihood training requires estimating the log-determinant of the Jacobian and is computationally expensive, thus imposing a trade-off between computation and expressive power. In this work, we propose a new approach for exact training of such neural networks. Based on relative gradients, we exploit the matrix structure of neural network parameters to compute updates efficiently even in high-dimensional spaces; the computational cost of training is quadratic in the input size, in contrast with the cubic scaling of naive approaches. This allows fast training with objective functions involving the log-determinant of the Jacobian, without imposing constraints on its structure, in stark contrast to autoregressive normalizing flows.
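The computational trick can be seen already for a single square linear layer: right-multiplying the Euclidean gradient by W^T W (the relative gradient, familiar from ICA) turns the gradient of the log|det W| term, which is W^{-T}, into W itself, so neither a matrix inverse nor a log-determinant needs to be computed. The sketch below uses a toy ICA-style likelihood; the data, nonlinearity, and step size are assumptions for illustration, not the paper's full multi-layer algorithm.

    import numpy as np

    # Toy objective for z = W x over a batch of n samples:
    #   L(W) = log|det W| - (1/n) * sum log cosh(Z)
    # Euclidean gradient: dL/dW = W^{-T} - (1/n) tanh(Z)^T X
    # Relative gradient (right-multiplied by W^T W):
    #   (W^{-T} - (1/n) tanh(Z)^T X) W^T W = W - (1/n) tanh(Z)^T Z W
    rng = np.random.default_rng(0)
    d, n = 5, 200
    X = rng.laplace(size=(n, d))                   # toy super-Gaussian data
    W = np.eye(d) + 0.01 * rng.normal(size=(d, d))

    lr = 0.05
    for _ in range(500):
        Z = X @ W.T
        grad_rel = W - (np.tanh(Z).T @ Z / n) @ W  # no inverse, no log-det
        W += lr * grad_rel                         # ascend the log-likelihood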


c10f48884c9c7fdbd9a7959c59eebea8-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their comments and the largely positive feedback. Reviewers agree that "the paper clearly ...", that the improvement our approach provides "is demonstrated by experiments", and the contribution was praised as "elegant".
R6: Rigorous formulation and convergence properties of the relative gradient: we will add more details on this, and we will include these references in the paper. These architectures have several limitations, e.g. they ... The drawback of this approach is that the permutation matrix P cannot be learned. We will include this discussion and reference in the paper.



Transferable Adversarial Attacks on SAM and Its Downstream Models

Neural Information Processing Systems

The use of large foundation models poses a dilemma: while fine-tuning downstream models from them holds promise for exploiting their well-generalized knowledge in practical applications, their open accessibility also poses threats of adverse usage. This paper, for the first time, explores the feasibility of adversarially attacking various downstream models fine-tuned from the segment anything model (SAM), solely utilizing the information from the open-sourced SAM. In contrast to prevailing transfer-based adversarial attacks, we demonstrate the existence of adversarial dangers even without access to the downstream task and dataset needed to train a similar surrogate model. To enhance the effectiveness of the adversarial attack against models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm to extract the intrinsic vulnerability inherent in the foundation model, which is then utilized as prior knowledge to guide the generation of adversarial perturbations. Moreover, by formulating the gradient difference in the attacking process between the open-sourced SAM and its fine-tuned downstream models, we theoretically demonstrate that a deviation occurs in the adversarial update direction when directly maximizing the distance of encoded feature embeddings in the open-sourced SAM. Consequently, we propose a gradient robust loss that simulates the associated uncertainty with gradient-based noise augmentation to enhance the robustness of generated adversarial examples (AEs) to this deviation, thus improving transferability. Extensive experiments demonstrate the effectiveness of the proposed universal meta-initialized and gradient robust adversarial attack (UMI-GRAT) toward SAMs and their downstream models.
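For intuition, a transfer attack of this flavor perturbs the input so that the surrogate encoder's feature embedding moves away from the clean embedding. The sketch below is a generic feature-space PGD in PyTorch that averages gradients over noise-perturbed inputs as a simple robustness heuristic; the encoder, step sizes, and noise scale are illustrative assumptions, and the averaging stands in for, but is not, the authors' gradient robust loss or UMI initialization.

    import torch

    def feature_space_attack(encoder, x, steps=10, eps=8/255, alpha=2/255,
                             noise_std=0.05, n_noise=4):
        # Push the adversarial image's features away from the clean features
        # of a surrogate encoder, under an L_inf budget of eps.
        with torch.no_grad():
            feat_clean = encoder(x)
        x_adv = x.clone().detach()
        for _ in range(steps):
            grad = torch.zeros_like(x_adv)
            for _ in range(n_noise):
                x_noisy = (x_adv + noise_std * torch.randn_like(x_adv)).requires_grad_(True)
                loss = torch.norm(encoder(x_noisy) - feat_clean)
                grad = grad + torch.autograd.grad(loss, x_noisy)[0]
            x_adv = x_adv + alpha * (grad / n_noise).sign()   # ascent step
            x_adv = x + (x_adv - x).clamp(-eps, eps)          # L_inf projection
            x_adv = x_adv.clamp(0, 1).detach()
        return x_adv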


LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

Neural Information Processing Systems

Although large language models (LLMs) have demonstrated strong capabilities, their high demand for computation and storage hinders practical application. To this end, many model compression techniques have been proposed to increase the efficiency of LLMs. However, current research validates these methods only on a limited set of models, datasets, and metrics, and lacks a comprehensive evaluation under more general scenarios, so it remains unclear which compression approach should be used in a given case. To close this gap, we present the Large Language Model Compression Benchmark (LLMCBench), a rigorously designed benchmark with an in-depth analysis of LLM compression algorithms. We first analyze actual model production requirements and carefully design evaluation tracks and metrics. We then conduct extensive experiments and comparisons using multiple mainstream LLM compression approaches. Finally, we perform an in-depth analysis based on the evaluation and provide useful insights for LLM compression design. We hope LLMCBench can offer insightful suggestions for LLM compression algorithm design and serve as a foundation for future research.
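To make the track-and-metric design concrete, a benchmark harness of this kind can be organized as a grid over compression methods, evaluation tracks, and metrics, with each cell filled by a scoring routine. The skeleton below is hypothetical; the track names, method names, and scoring stub are assumptions for illustration and do not reflect LLMCBench's actual tracks or results.

    from itertools import product

    # Hypothetical tracks, methods, and metrics for illustration only.
    TRACKS = ["language_modeling", "reasoning", "inference_efficiency"]
    METHODS = ["quantization", "pruning", "knowledge_distillation"]
    METRICS = ["accuracy", "speedup", "memory_ratio"]

    def evaluate(method: str, track: str, metric: str) -> float:
        # Placeholder: a real harness would load the compressed model and
        # run the track's evaluation suite here.
        return 0.0

    results = {
        (m, t, k): evaluate(m, t, k)
        for m, t, k in product(METHODS, TRACKS, METRICS)
    }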


Figure 1: Phase transitions of v̂(m, 0.2, 0.01, s)
Figure 2: Phase transitions of v̂(m, 0.05, 0.05, s)

Neural Information Processing Systems

Thank you very much for your reviews. The trends match those in the submission, as expected. As mentioned in footnote 20, the design is based on Section 3.1 of [13] (for s = 1). How do I compute δ̂(s, m, ε, w)? When ε = 0.07 and m = 500, there is similarly a local maximum (somewhere in 24 ≤ s ≤ 32) followed by a ... I appreciate that you read through my supplementary material, and I will certainly address the typos you noted.


Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition

Neural Information Processing Systems

This work studies the problem of learning episodic Markov Decision Processes with known transition and bandit feedback. We develop the first algorithm with a "best-of-both-worlds" guarantee: it achieves O(log T) regret when the losses are stochastic, and simultaneously enjoys worst-case robustness with Õ(√T) regret even when the losses are adversarial, where T is the number of episodes. More generally, it achieves Õ(√C) regret in an intermediate setting where the losses are corrupted by a total amount of C. Our algorithm is based on the Follow-the-Regularized-Leader method from Zimin and Neu [26], with a novel hybrid regularizer inspired by recent works of Zimmert et al. [27, 29] for the special case of multi-armed bandits. Crucially, our regularizer admits a non-diagonal Hessian with a highly complicated inverse. Analyzing such a regularizer and deriving a particular self-bounding regret guarantee is our key technical contribution and might be of independent interest.
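For readers unfamiliar with the method family, Follow-the-Regularized-Leader picks the next decision by minimizing the cumulative (estimated) loss plus a regularizer over the feasible set. The sketch below is the classic negative-entropy special case on the probability simplex, which has the closed form of exponential weights; the paper's hybrid regularizer over occupancy measures is considerably more involved and is not reproduced here.

    import numpy as np

    def ftrl_entropy(loss_rounds, eta=0.1):
        # FTRL with negative-entropy regularizer on the simplex:
        #   x_t = argmin_x <cum_loss, x> + (1/eta) * sum_i x_i log x_i,
        # whose solution is x_t proportional to exp(-eta * cum_loss).
        K = len(loss_rounds[0])
        cum_loss = np.zeros(K)
        plays = []
        for losses in loss_rounds:
            logits = -eta * cum_loss
            x = np.exp(logits - logits.max())
            x /= x.sum()
            plays.append(x)
            cum_loss += np.asarray(losses)
        return plays

    # Toy usage: 3 arms, arbitrary losses in [0, 1] over 100 rounds.
    rng = np.random.default_rng(0)
    plays = ftrl_entropy(rng.uniform(size=(100, 3)))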