AITopics | Statistical Learning

We consider the problem of online learning in the presence of distribution shifts that occur at an unknown rate and of unknown intensity. We derive a new Bayesian online inference approach to simultaneously infer these distribution shifts and adapt the model to the detected changes by integrating ideas from change point detection, switching dynamical systems, and Bayesian online learning. Using a binary'change variable,' we construct an informative prior such that-if a change is detected-the model partially erases the information of past model updates by tempering to facilitate adaptation to the new data distribution. Furthermore, the approach uses beam search to track multiple change-point hypotheses and selects the most probable one in hindsight. Our proposed method is model-agnostic, applicable in both supervised and unsupervised learning settings, suitable for an environment of concept drifts or covariate drifts, and yields improvements over state-of-the-art Bayesian online learning approaches.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report (0.46)

Industry:

Education > Educational Setting > Online (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Add feedback

35a12c43227f217207d4e06ffefe39d3-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 11:02:41 GMT

artificial intelligence, dimension, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

19dbb86f771ddbf9986cf0c9b1c61c17-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 11:02:34 GMT

artificial intelligence, generalization, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Time-Independent Information-Theoretic Generalization Bounds for SGLD

Neural Information Processing SystemsApr-25-2026, 11:02:30 GMT

We provide novel information-theoretic generalization bounds for stochastic gradient Langevin dynamics (SGLD) under the assumptions of smoothness and dissipativity, which are widely used in sampling and non-convex optimization studies. Our bounds are time-independent and decay to zero as the sample size increases, regardless of the number of iterations and whether the step size is fixed. Unlike previous studies, we derive the generalization error bounds by focusing on the time evolution of the Kullback-Leibler divergence, which is related to the stability of datasets and is the upper bound of the mutual information between output parameters and an input dataset. Additionally, we establish the first information-theoretic generalization bound when the training and test loss are the same by showing that a loss function of SGLD is sub-exponential. This bound is also time-independent and removes the problematic step size dependence in existing work, leading to an improved excess risk bound by combining our analysis with the existing non-convex optimization error bounds.

artificial intelligence, generalization, machine learning, (14 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

Add feedback

34adeb8e3242824038aa65460a47c29e-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 10:47:55 GMT

data mining, machine learning, subgraph, (18 more...)

Neural Information Processing Systems

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
(2 more...)

Add feedback

1967f962c7c2083618236d80eeb9d1ac-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 10:30:26 GMT

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe (0.28)

Technology:

Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Learnability of Linear Thresholds from Label Proportions

Neural Information Processing SystemsApr-25-2026, 10:28:10 GMT

We study the problem of properly learning linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this, the learning is on a collection of bags of feature-vectors with only the proportion of labels available for each bag. First, we provide an algorithm that, given a collection of such bags each of size at most two whose label proportions are consistent with (i.e., the bags are satisfied by) an unknown LTF, efficiently produces an LTF that satisfies at least (2/5)-fraction of the bags. If all the bags are non-monochromatic (i.e., bags of size two with differently labeled feature-vectors) the algorithm satisfies at least (1/2)-fraction of them. For the special case of OR over the d-dimensional boolean vectors, we give an algorithm which computes an LTF achieving an additional Ω(1/d) in accuracy for the two cases.

artificial intelligence, ltf, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Neural Information Processing SystemsApr-25-2026, 10:12:34 GMT

Large pretrained language models (LMs) like BERT have improved performance in many disparate natural language processing (NLP) tasks. However, fine tuning such models requires a large number of training examples for each target task. Simultaneously, many realistic NLP problems are "few shot", without a sufficiently large training set. In this work, we propose a novel conditional neural process-based approach for few-shot text classification that learns to transfer from other diverse tasks with rich annotation. Our key idea is to represent each task using gradient information from a base model and to train an adaptation network that modulates a text classifier conditioned on the task representation. While previous task-aware few-shot learners represent tasks by input encoding, our novel task representation is more powerful, as the gradient captures input-output relationships of a task. Experimental results show that our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches on a collection of diverse few-shot tasks. We further conducted analysis and ablations to justify our design choices.

machine learning, natural language, text classification, (17 more...)

Neural Information Processing Systems

Country: