npn
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- (3 more...)
NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization
Wei, Xiyuan, Lin, Chih-Jen, Yang, Tianbao
Accurately estimating the normalization term (also known as the partition function) in the contrastive loss is a central challenge for training Contrastive Language-Image Pre-training (CLIP) models. Conventional methods rely on large batches for approximation, demanding substantial computational resources. To mitigate this issue, prior works introduced per-sample normalizer estimators, which are updated at each epoch in a blockwise-coordinate manner to keep track of the updated encoders; such infrequent updates, however, can lag behind the rapidly evolving model. To overcome this limitation, we propose NeuCLIP, a novel and elegant optimization framework based on two key ideas: (i) reformulating the contrastive loss for each sample, via convex analysis, into a minimization problem with an auxiliary variable representing its log-normalizer; and (ii) transforming the resulting minimization over n auxiliary variables (where n is the dataset size), via variational analysis, into a minimization over a compact neural network that predicts the log-normalizers. We design an alternating optimization algorithm that jointly trains the CLIP model and the auxiliary network. By employing a tailored architecture and acceleration techniques for the auxiliary network, NeuCLIP achieves more accurate normalizer estimation, leading to improved performance compared with previous methods. Extensive experiments on large-scale CLIP training, spanning datasets from millions to billions of samples, demonstrate that NeuCLIP outperforms previous methods.
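As a hedged illustration of idea (i) (an editor's sketch, not taken verbatim from the paper): the log-normalizer of each row of similarity scores admits a standard variational form from convex duality, which is what lets the normalizer be traded for a learnable auxiliary variable.

```latex
% Per-anchor contrastive loss with similarity scores s_{ij}:
%   L_i = -s_{ii} + \log Z_i,  where  Z_i = \sum_j e^{s_{ij}}.
% Convex-duality identity for the log-normalizer (idea (i)):
\log Z_i \;=\; \min_{b_i \in \mathbb{R}} \Big\{\, b_i - 1 + e^{-b_i} \textstyle\sum_j e^{s_{ij}} \,\Big\}
% Setting the derivative to zero gives e^{-b_i} Z_i = 1, i.e., b_i = \log Z_i,
% so the minimum is attained exactly at the log-normalizer. Idea (ii) then
% replaces the n variables b_1, ..., b_n with the output of a compact network
% b_i \approx \phi(x_i), trained alternately with the CLIP encoders.
```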
- Europe > Switzerland (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- Asia > Taiwan (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
BabyLM's First Constructions: Causal probing provides a signal of learning
Rozner, Joshua, Weissweiler, Leonie, Shain, Cory
Construction grammar posits that language learners acquire constructions (form-meaning pairings) from the statistics of their environment. Recent work supports this hypothesis by showing sensitivity to constructions in pretrained language models (PLMs), including one recent study (Rozner et al., 2025) demonstrating that constructions shape RoBERTa's output distribution. However, models under study have generally been trained on developmentally implausible amounts of data, casting doubt on their relevance to human language learning. Here we use Rozner et al.'s methods to evaluate construction learning in masked language models from the 2024 BabyLM Challenge. Our results show that even when trained on developmentally plausible quantities of data, models learn diverse constructions, even hard cases that are superficially indistinguishable. We further find correlational evidence that constructional performance may be functionally relevant: models that better represent constructions perform better on the BabyLM benchmarks.
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (14 more...)
Reviews: Natural-Parameter Networks: A Class of Probabilistic Neural Networks
The paper presents a novel and potentially impactful way of learning uncertainty over model parameters. The derivation of novel activation functions for which first and second moments are computable in closed form (for distributions in the exponential family) appears to be the main novel contribution, as this is what allows forward propagation of exponential-family distributions through the network and learning of their parameters via backprop. The work does bear some resemblance to earlier work on "Implicit Variance Networks" by Bayer et al., which ought to be discussed. On a technical level, the method appears to be effective, and the authors empirically verify that (1) the method is robust to overfitting, (2) predictive uncertainty is well calibrated, and (3) propagating distributions over latent states can outperform deterministic methods. The fact that these second-order representations outperform those of a VAE is somewhat more surprising and may warrant further experimentation: it would imply that the approximation used by the VAE at inference is worse than the approximation made by NPN that each layer's activation belongs to the exponential family.
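For intuition about why closed-form moments matter, here is one illustrative and well-known closed-form pair, for a ReLU under a Gaussian pre-activation; the specific activations derived in the paper may differ.

```latex
% X ~ N(mu, sigma^2), Y = max(0, X); \varphi and \Phi denote the standard
% normal pdf and cdf. Both moments are exact and differentiable in (mu, sigma),
% so distributions can be propagated forward and trained by backprop:
\mathbb{E}[Y] = \mu\,\Phi\!\big(\tfrac{\mu}{\sigma}\big) + \sigma\,\varphi\!\big(\tfrac{\mu}{\sigma}\big),
\qquad
\mathbb{E}[Y^2] = (\mu^2+\sigma^2)\,\Phi\!\big(\tfrac{\mu}{\sigma}\big) + \mu\sigma\,\varphi\!\big(\tfrac{\mu}{\sigma}\big).
```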
Natural-Parameter Networks: A Class of Probabilistic Neural Networks
Neural networks (NN) have achieved state-of-the-art performance in various applications. Unfortunately in applications where training data is insufficient, they are often prone to overfitting. One effective way to alleviate this problem is to exploit the Bayesian approach by using Bayesian neural networks (BNN). Another shortcoming of NN is the lack of flexibility to customize different distributions for the weights and neurons according to the data, as is often done in probabilistic graphical models. To address these problems, we propose a class of probabilistic neural networks, dubbed natural-parameter networks (NPN), as a novel and lightweight Bayesian treatment of NN.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- (3 more...)
Neural Plasticity Networks
Neural plasticity is an important functionality of the human brain, in which the number of neurons and synapses can shrink or expand in response to stimuli throughout the life span. We model this dynamic learning process as an $L_0$-norm regularized binary optimization problem, in which each unit of a neural network (e.g., a weight, neuron, or channel) is attached to a stochastic binary gate, whose parameters determine the level of activity of that unit in the network. At the beginning, only a small portion of the binary gates (and therefore the corresponding neurons) are activated, while the remaining neurons are in a hibernation mode. As learning proceeds, some neurons may be activated or deactivated if doing so is justified by the cost-benefit tradeoff measured by the $L_0$-norm regularized objective. As training matures, the probability of transition between activation and deactivation diminishes until a final hardening stage. We demonstrate that all of these learning dynamics can be modulated seamlessly by a single parameter $k$. Our neural plasticity network (NPN) can prune or expand a network depending on the initial capacity provided by the user; it also unifies dropout (when $k=0$) and traditional training of DNNs (when $k=\infty$), and interpolates between these two. To the best of our knowledge, this is the first learning framework that unifies network sparsification and network expansion in an end-to-end training pipeline. Extensive experiments on a synthetic dataset and multiple image classification benchmarks demonstrate the superior performance of NPN. We show that both network sparsification and network expansion can yield compact models of similar architectures and similar predictive accuracies that are close to, or sometimes even higher than, those of baseline networks. We plan to release our code to facilitate research in this area.
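The abstract does not specify how the stochastic binary gates are parameterized; below is a minimal sketch of one standard choice, the hard-concrete relaxation of Louizos et al. (2018), with a differentiable $L_0$ penalty. The constants, initialization, and penalty weight are illustrative placeholders, and the paper's $k$-modulated hardening schedule is not reproduced here.

```python
import torch

# Hard-concrete stochastic gates: a standard way to realize trainable
# binary gates under an L0 penalty (the paper's exact gate model may differ).
BETA, GAMMA, ZETA = 2 / 3, -0.1, 1.1  # temperature and stretch constants

def sample_gates(log_alpha: torch.Tensor) -> torch.Tensor:
    """Reparameterized sample of relaxed binary gates z in [0, 1]."""
    u = torch.rand_like(log_alpha).clamp(1e-6, 1 - 1e-6)
    s = torch.sigmoid((u.log() - (1 - u).log() + log_alpha) / BETA)
    return (s * (ZETA - GAMMA) + GAMMA).clamp(0.0, 1.0)

def expected_l0(log_alpha: torch.Tensor) -> torch.Tensor:
    """E[number of non-zero gates] -- the differentiable L0 penalty."""
    shift = BETA * torch.log(torch.tensor(-GAMMA / ZETA))
    return torch.sigmoid(log_alpha - shift).sum()

# Gate 256 hidden units; a negative init keeps most units dormant at first.
log_alpha = torch.full((256,), -2.0, requires_grad=True)
h = torch.randn(32, 256)                      # stand-in activations
gated = h * sample_gates(log_alpha)           # units with z = 0 "hibernate"
loss = gated.pow(2).mean() + 1e-3 * expected_l0(log_alpha)  # task loss + L0
loss.backward()
```

Units whose gates collapse to zero are effectively pruned; raising a gate's activation probability corresponds to waking a hibernating unit, which is the expand direction of the tradeoff the abstract describes.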
Natural-Parameter Networks: A Class of Probabilistic Neural Networks
Wang, Hao, Shi, Xingjian, Yeung, Dit-Yan
Neural networks (NN) have achieved state-of-the-art performance in various applications. Unfortunately in applications where training data is insufficient, they are often prone to overfitting. One effective way to alleviate this problem is to exploit the Bayesian approach by using Bayesian neural networks (BNN). Another shortcoming of NN is the lack of flexibility to customize different distributions for the weights and neurons according to the data, as is often done in probabilistic graphical models. To address these problems, we propose a class of probabilistic neural networks, dubbed natural-parameter networks (NPN), as a novel and lightweight Bayesian treatment of NN. NPN allows the usage of arbitrary exponential-family distributions to model the weights and neurons. Different from traditional NN and BNN, NPN takes distributions as input and goes through layers of transformation before producing distributions to match the target output distributions. As a Bayesian treatment, efficient backpropagation (BP) is performed to learn the natural parameters for the distributions over both the weights and neurons. The output distributions of each layer, as byproducts, may be used as second-order representations for the associated tasks such as link prediction. Experiments on real-world datasets show that NPN can achieve state-of-the-art performance.
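A hedged sketch of the core "distributions in, distributions out" mechanic for the Gaussian case: exact moment propagation through a linear layer with independent Gaussian weights and inputs. This is an editor's illustration; the paper parameterizes general exponential-family distributions by their natural parameters rather than by means and variances.

```python
import numpy as np

# Gaussian NPN-style linear layer: inputs, weights, and outputs are each
# represented by elementwise means and variances instead of point values.
rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W_mean = rng.normal(size=(d_out, d_in))
W_var = np.full((d_out, d_in), 0.1)
b_mean, b_var = np.zeros(d_out), np.full(d_out, 0.1)

def linear_moments(x_mean, x_var):
    """Exact output mean/variance for independent Gaussian weights & inputs."""
    y_mean = W_mean @ x_mean + b_mean
    y_var = (W_var @ (x_var + x_mean**2)   # Var[w] * E[x^2]
             + (W_mean**2) @ x_var         # E[w]^2 * Var[x]
             + b_var)
    return y_mean, y_var

x_mean, x_var = rng.normal(size=d_in), np.full(d_in, 0.05)
m, v = linear_moments(x_mean, x_var)
print(m, v)  # a distribution over outputs, not a point estimate
```

Stacking such layers with activations whose moments are available in closed form (see the review above) yields the forward pass; the per-layer output variances are the "second-order representations" the abstract mentions.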
- North America > United States (0.67)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Spain (0.14)
Stochastic Heavy Ball
Gadat, Sébastien, Panloup, Fabien, Saadane, Sofiane
This paper deals with a natural stochastic optimization procedure derived from the so-called Heavy-ball differential equation, which was introduced by Polyak in the 1960s with his seminal contribution [Pol64]. The Heavy-ball method is a second-order dynamics that was investigated to minimize convex functions f. The family of second-order methods has received a large amount of attention since the famous contribution of Nesterov [Nes83], spurred by the explosion of large-scale optimization problems. This work provides an in-depth description of the stochastic heavy-ball method, an adaptation of the deterministic one in which only unbiased evaluations of the gradient are available and used throughout the iterations of the algorithm. We first describe some almost-sure convergence results in the case of general non-convex coercive functions f. We then examine the situation of convex and strongly convex potentials and derive some non-asymptotic results about the stochastic heavy-ball method. We end our study with limit theorems on several rescaled algorithms.
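For reference, a minimal sketch of the stochastic heavy-ball iteration the abstract describes, x_{k+1} = x_k - gamma_k * g_k + beta * (x_k - x_{k-1}) with g_k an unbiased gradient estimate. The step-size schedule, momentum value, and test objective below are hypothetical placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(x):
    """Unbiased gradient of f(x) = 0.5 * ||x||^2, corrupted by noise."""
    return x + 0.1 * rng.normal(size=x.shape)

x_prev = x = np.array([5.0, -3.0])
beta = 0.9                                     # momentum ("heaviness")
for k in range(1, 2001):
    gamma = 0.1 / k**0.75                      # vanishing step size
    x_next = x - gamma * stochastic_grad(x) + beta * (x - x_prev)
    x_prev, x = x, x_next
print(x)  # approaches the minimizer 0 as the noise is damped by gamma_k -> 0
```

The friction implicit in beta < 1 plus the vanishing step sizes is what makes almost-sure convergence plausible here; the paper makes this precise for non-convex coercive f and sharpens it to non-asymptotic rates in the (strongly) convex case.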
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- North America > United States > New York (0.04)
- North America > United States > Texas > Hays County > San Marcos (0.04)
- (7 more...)