AITopics | Europe

Collaborating Authors

Europe

On the Expressiveness of Approximate Inference in Bayesian Neural Networks

Neural Information Processing SystemsMay-31-2025, 14:12:18 GMT

While Bayesian neural networks (BNNs) hold the promise of being flexible, wellcalibrated statistical models, inference often requires approximations whose consequences are poorly understood. We study the quality of common variational methods in approximating the Bayesian predictive distribution. For single-hidden layer ReLU BNNs, we prove a fundamental limitation in function-space of two of the most commonly used distributions defined in weight-space: mean-field Gaussian and Monte Carlo dropout. We find there are simple cases where neither method can have substantially increased uncertainty in between well-separated regions of low uncertainty. We provide strong empirical evidence that exact inference does not have this pathology, hence it is due to the approximation and not the model. In contrast, for deep networks, we prove a universality result showing that there exist approximate posteriors in the above classes which provide flexible uncertainty estimates. However, we find empirically that pathologies of a similar form as in the single-hidden layer case can persist when performing variational inference in deeper networks. Our results motivate careful consideration of the implications of approximate inference methods in BNNs.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Wisconsin (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Handling correlated and repeated measurements with the smoothed multivariate square-root Lasso

Neural Information Processing SystemsMay-31-2025, 14:04:29 GMT

A limitation of Lasso-type estimators is that the optimal regularization parameter depends on the unknown noise level. Estimators such as the concomitant Lasso address this dependence by jointly estimating the noise level and the regression coefficients. Additionally, in many applications, the data is obtained by averaging multiple measurements: this reduces the noise variance, but it dramatically reduces sample sizes and prevents refined noise modeling. In this work, we propose a concomitant estimator that can cope with complex noise structure by using nonaveraged measurements, its data-fitting term arising as a smoothing of the nuclear norm. The resulting optimization problem is convex and amenable, thanks to smoothing theory, to state-of-the-art optimization techniques that leverage the sparsity of the solutions. Practical benefits are demonstrated on toy datasets, realistic simulated data and real neuroimaging data.

artificial intelligence, machine learning, repetition, (18 more...)

Neural Information Processing Systems

Country:

Europe > France (0.14)
North America > Canada (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Add feedback

Multi-language Diversity Benefits Autoformalization

Neural Information Processing SystemsMay-31-2025, 13:59:20 GMT

Autoformalization is the task of translating natural language materials into machineverifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence. Existing methods tend to circumvent this challenge by manually curating small corpora or using few-shot learning with large language models. But these methods suffer from data scarcity and formal language acquisition difficulty. In this work, we create MMA, a large, flexible, multi-language, and multi-domain dataset of informal-formal pairs, by using a language model to translate in the reverse direction, that is, from formal mathematical statements into corresponding informal ones. Experiments show that language models fine-tuned on MMA can produce up to 29 31% of statements acceptable with minimal corrections on the miniF2F and ProofNet benchmarks, up from 0% with the base model. We demonstrate that fine-tuning on multi-language formal data results in more capable autoformalization models even on single-language tasks.

large language model, logic & formal reasoning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Hawaii (0.14)
North America > United States > California (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Add feedback

Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation

Neural Information Processing SystemsMay-31-2025, 13:59:00 GMT

There is a growing interest in using Large Language Models (LLMs) in multi-agent systems to tackle interactive real-world tasks that require effective collaboration and assessment of complex situations. Yet, we have a limited understanding of LLMs' communication and decision-making abilities in multi-agent setups.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Games (1.00)
Law (1.00)
Government (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards Scale-Invariant Graph-related Problem Solving by Iterative Homogeneous Graph Neural Networks

Neural Information Processing SystemsMay-31-2025, 13:48:24 GMT

Taking the perspective of synthesizing graph theory programs, we propose several extensions to address the issue. First, inspired by the dependency of the iteration number of common graph theory algorithms on graph size, we learn to terminate the message passing process in GNNs adaptively according to the computation progress. Second, inspired by the fact that many graph theory algorithms are homogeneous with respect to graph weights, we introduce homogeneous transformation layers that are universal homogeneous function approximators, to convert ordinary GNNs to be homogeneous. Experimentally, we show that our GNN can be trained from small-scale graphs but generalize well to large-scale graphs for a number of basic graph theory problems. It also shows generalizability for applications of multi-body physical simulation and image-based navigation problems.

artificial intelligence, graph, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
North America > Canada (0.14)
Europe > Sweden (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Policy Continuation with Hindsight Inverse Dynamics

Hao Sun, Zhizhong Li, Xiaotong Liu, Bolei Zhou, Dahua Lin

Neural Information Processing SystemsMay-31-2025, 13:43:43 GMT

Solving goal-oriented tasks is an important but challenging problem in reinforcement learning (RL). For such tasks, the rewards are often sparse, making it difficult to learn a policy effectively. To tackle this difficulty, we propose a new approach called Policy Continuation with Hindsight Inverse Dynamics (PCHID). This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay. Enabling the learning process in a self-imitated manner and thus can be trained with supervised learning. This work also extends it to multi-step settings with Policy Continuation. The proposed method is general, which can work in isolation or be combined with other on-policy and off-policy algorithms.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Neural Information Processing Systems

Country:

Asia (0.28)
North America > Canada (0.14)
Europe > France (0.14)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset

Neural Information Processing SystemsMay-31-2025, 13:41:20 GMT

Neural networks are most often trained under the assumption that data come from a stationary distribution.

artificial intelligence, bayesian inference, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England (0.14)
Europe > Italy (0.14)
Europe > France (0.14)

Genre: Research Report > Experimental Study (0.92)

Industry: Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Recurrent Space-time Graph Neural Networks

Andrei Nicolicioiu, Iulia Duta, Marius Leordeanu

Neural Information Processing SystemsMay-31-2025, 13:38:46 GMT

Learning in the space-time domain remains a very challenging problem in machine learning and computer vision. Current computational models for understanding spatio-temporal visual data are heavily rooted in the classical single-image based paradigm. It is not yet well understood how to integrate information in space and time into a single, general model. We propose a neural graph model, recurrent in space and time, suitable for capturing both the local appearance and the complex higher-level interactions of different entities and objects within the changing world scene. Nodes and edges in our graph have dedicated neural networks for processing information.

artificial intelligence, information, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Romania (0.14)
North America > Canada (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards a theory of how the structure of language is acquired by deep neural networks

Neural Information Processing SystemsMay-31-2025, 13:37:31 GMT

How much data is required to learn the structure of a language via next-token prediction? We study this question for synthetic datasets generated via a Probabilistic Context-Free Grammar (PCFG)--a tree-like generative model that captures many of the hierarchical structures found in natural languages. We determine token-token correlations analytically in our model and show that they can be used to build a representation of the grammar's hidden variables, the longer the range the deeper the variable. In addition, a finite training set limits the resolution of correlations to an effective range, whose size grows with that of the training set. As a result, a Language Model trained with increasingly many examples can build a deeper representation of the grammar's structure, thus reaching good performance despite the high dimensionality of the problem. We conjecture that the relationship between training set size and effective range of correlations holds beyond our synthetic datasets. In particular, our conjecture predicts how the scaling law for the test loss behaviour with training set size depends on the length of the context window, which we confirm empirically in Shakespeare's plays and Wikipedia articles.

correlation, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe > Italy (0.14)
Europe > Belgium (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Self-supervised Transformation Learning for Equivariant Representations Dong-Jae Lee

Neural Information Processing SystemsMay-31-2025, 13:33:53 GMT

Unsupervised representation learning has significantly advanced various machine learning tasks. In the computer vision domain, state-of-the-art approaches utilize transformations like random crop and color jitter to achieve invariant representations, embedding semantically the same inputs despite transformations.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.93)

Technology: