AITopics | Educational Setting

Collaborating Authors

Educational Setting

Energy-based Hopfield Boosting for Out-of-Distribution Detection Claus Hofmann 1 Simon Schmid 2 Daniel Klotz

Neural Information Processing SystemsMar-27-2025, 14:07:39 GMT

Out-of-distribution (OOD) detection is critical when deploying machine learning models in the real world. Outlier exposure methods, which incorporate auxiliary outlier data in the training process, can drastically improve OOD detection performance compared to approaches without advanced training strategies. We introduce Hopfield Boosting, a boosting approach, which leverages modern Hopfield energy to sharpen the decision boundary between the in-distribution and OOD data. Hopfield Boosting encourages the model to focus on hard-to-distinguish auxiliary outlier examples that lie close to the decision boundary between in-distribution and auxiliary outlier data. Our method achieves a new state-of-the-art in OOD detection with outlier exposure, improving the FPR95 from 2.28 to 0.92 on CIFAR-10, from 11.76 to 7.94 on CIFAR-100, and from 50.74 to 36.60 on ImageNet-1K.

artificial intelligence, detection, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Spain (0.27)
Europe > Austria > Upper Austria (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Health & Medicine (0.67)
Education > Educational Setting > Online (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Sharpness-Aware Minimization Activates the Interactive Teaching's Understanding and Optimization School of Artificial Intelligence, Jilin University, China

Neural Information Processing SystemsMar-27-2025, 14:03:43 GMT

Teaching is a potentially effective approach for understanding interactions among multiple intelligences. Previous explorations have convincingly shown that teaching presents additional opportunities for observation and demonstration within the learning model, such as data distillation and selection. However, the underlying optimization principles and convergence of interactive teaching lack theoretical analysis, and in this regard co-teaching serves as a notable prototype. In this paper, we discuss its role as a reduction of the larger loss landscape derived from Sharpness-Aware Minimization (SAM). Then, we classify it as an iterative parameter estimation process using Expectation-Maximization. The convergence of this typical interactive teaching is achieved by continuously optimizing a variational lower bound on the log marginal likelihood. This lower bound represents the expected value of the log posterior distribution of the latent variables under a scaled, factorized variational distribution. To further enhance interactive teaching's performance, we incorporate SAM's strong generalization information into interactive teaching, referred as Sharpness Reduction Interactive Teaching (SRIT). This integration can be viewed as a novel sequential optimization process.

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.50)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
(2 more...)

Add feedback

Bayesian Online Natural Gradient (BONG)

Neural Information Processing SystemsMar-27-2025, 13:53:28 GMT

We propose a novel approach to sequential Bayesian inference based on variational Bayes (VB). The key insight is that, in the online setting, we do not need to add the KL term to regularize to the prior (which comes from the posterior at the previous timestep); instead we can optimize just the expected log-likelihood, performing a single step of natural gradient descent starting at the prior predictive. We prove this method recovers exact Bayesian inference if the model is conjugate. We also show how to compute an efficient deterministic approximation to the VB objective, as well as our simplified objective, when the variational distribution is Gaussian or a sub-family, including the case of a diagonal plus low-rank precision matrix. We show empirically that our method outperforms other online VB methods in the non-conjugate setting, such as online learning for neural networks, especially when controlling for computational costs.

approximation, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Spain (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Add feedback

Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs

Neural Information Processing SystemsMar-27-2025, 13:48:05 GMT

This study considers online learning with general directed feedback graphs. For this problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds for adversarial environments as well as poly-logarithmic regret bounds for stochastic environments. As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: strongly observable graphs yield minimax regret of Θ(α

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.29)

Genre: Research Report (0.46)

Industry: Education > Educational Setting > Online (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.71)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs

Neural Information Processing SystemsMar-27-2025, 13:48:01 GMT

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > Japan (0.30)

Industry: Education > Educational Setting > Online (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.72)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

b6ffbbacbe2e56f2ec9a0da907382b4a-Paper-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 13:42:05 GMT

artificial intelligence, continual learning, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry:

Health & Medicine (0.68)
Education > Educational Setting (0.47)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Improved learning rates in multi-unit uniform price auctions Marius Potfer 1,2 Dorian Baudry 3 Hugo Richard 1

Neural Information Processing SystemsMar-27-2025, 13:38:05 GMT

Motivated by the strategic participation of electricity producers in electricity dayahead market, we study the problem of online learning in repeated multi-unit uniform price auctions focusing on the adversarial opposing bid setting. The main contribution of this paper is the introduction of a new modeling of the bid space.

artificial intelligence, auction, machine learning, (20 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Simple and Effective Masked Diffusion Language Models

Neural Information Processing SystemsMar-27-2025, 13:37:15 GMT

While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models and derive a simplified, Rao-Blackwellized objective that results in additional improvements. Our objective has a simple form--it is a mixture of classical masked language modeling losses-- and can be used to train encoder-only language models that admit efficient samplers, including ones that can generate arbitrary lengths of text semi-autoregressively like a traditional language model. On language modeling benchmarks, a range of masked diffusion models trained with modern engineering practices achieves a new state-of-the-art among diffusion models, and approaches AR perplexity.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > United States > Illinois (0.14)
North America > United States > New York (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Online Adaptation of Language Models with a Memory of Amortized Contexts Jihoon Tack, Eric Mitchell

Neural Information Processing SystemsMar-27-2025, 13:36:57 GMT

Due to the rapid generation and dissemination of information, large language models (LLMs) quickly run out of date despite enormous development costs. To address the crucial need to keep models updated, online learning has emerged as a critical tool when utilizing LLMs for real-world applications. However, given the ever-expanding corpus of unseen documents and the large parameter space of modern LLMs, efficient adaptation is essential. To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention. We propose a feature extraction and memory-augmentation approach to compress and extract information from new documents into compact modulations stored in a memory bank.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material (0.67)

Industry: Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs

Neural Information Processing SystemsMar-27-2025, 13:28:01 GMT

Consider the domain of multiclass classification within the adversarial online setting. What is the price of relying on bandit feedback as opposed to full information? To what extent can an adaptive adversary amplify the loss compared to an oblivious one? To what extent can a randomized learner reduce the loss compared to a deterministic one? We study these questions in the mistake bound model and provide nearly tight answers. We demonstrate that the optimal mistake bound under bandit feedback is at most O(k) times higher than the optimal mistake bound in the full information case, where k represents the number of labels. This bound is tight and provides an answer to an open question previously posed and studied by Daniely and Helbertal ['13] and by Long ['17, '20], who focused on deterministic learners.

artificial intelligence, bandit, machine learning, (19 more...)

Neural Information Processing Systems

Country: