
Neural Information Processing Systems

Algorithm 5 finds such a network in poly(n, l) time. Lemma (Appendix B.8): the sequence r(k) defined by (14) is strictly increasing. Lemma 6: for any positive integer k, the sequence r(k) defined by (14) satisfies r(k) … The number of closed connected subsets satisfying Definition 1 is a minimum. The interior and frontier (boundary) of a set X are denoted Int X and Fr X, respectively. Proposition 1: for any family of closed connected subsets satisfying Definition 1, all subsets are the … Let S(k) be the collection of all permutations of the set [k].


Linear Regression in p-adic metric spaces

Baker, Gregory D., McCallum, Scott, Pattinson, Dirk

arXiv.org Artificial Intelligence

Many real-world machine learning problems involve inherently hierarchical data, yet traditional approaches rely on Euclidean metrics that fail to capture the discrete, branching nature of hierarchical relationships. We present a theoretical foundation for machine learning in p-adic metric spaces, which naturally respect hierarchical structure. Our main result proves that an n-dimensional plane minimizing the p-adic sum of distances to points in a dataset must pass through at least n + 1 of those points -- a striking contrast to Euclidean regression that highlights how p-adic metrics better align with the discrete nature of hierarchical data. As a corollary, a polynomial of degree n constructed to minimize the p-adic sum of residuals will pass through at least n + 1 points. As a further corollary, a polynomial of degree n approximating a higher-degree polynomial at a finite number of points will yield a difference polynomial with distinct rational roots. We demonstrate the practical significance of this result through two applications in natural language processing: analyzing hierarchical taxonomies and modeling grammatical morphology. These results suggest that p-adic metrics may be fundamental to properly handling hierarchical data structures in machine learning. In hierarchical data, interpolation between points often makes less sense than selecting actual observed points as representatives.
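The p-adic distance behind this result can be made concrete. Below is a minimal sketch (not from the paper) of the p-adic absolute value |x|_p = p^(-v_p(x)) on the rationals, whose induced metric is an ultrametric; the function names are my own.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation v_p(x) of a nonzero rational x: the power of p
    dividing the numerator minus the power dividing the denominator."""
    x = Fraction(x)
    v, n, d = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def padic_abs(x, p):
    """p-adic absolute value |x|_p = p**(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(1, p) ** vp(x, p)
```

Under this metric, large powers of p are *small* (|8|_2 = 1/8), and the strong triangle inequality |x + y|_p <= max(|x|_p, |y|_p) holds, which is what gives the space its branching, tree-like geometry.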



Geometric Integration for Neural Control Variates

Meister, Daniel, Harada, Takahiro

arXiv.org Machine Learning

Thanks to our geometric subdivision, we can integrate the neural network analytically and use it as a control variate for Monte Carlo integration. The integral of the approximation provides a biased estimate (left), which is corrected by Monte Carlo integration of the residual integrand (center left), obtaining the final unbiased estimate (center right), which can achieve a lower error than vanilla Monte Carlo (right).

Abstract: Control variates are a variance-reduction technique for Monte Carlo integration. The principle involves approximating the integrand by a function that can be analytically integrated, and integrating by the Monte Carlo method only the residual difference between the integrand and the approximation, to obtain an unbiased estimate. Neural networks are universal approximators that could potentially be used as a control variate. However, the challenge lies in the analytic integration, which is not possible in general. In this manuscript, we study one of the simplest neural network models, the multilayer perceptron (MLP) with continuous piecewise linear activation functions, and its possible analytic integration. We propose an integration method based on integration-domain subdivision, employing techniques from computational geometry to solve this problem in 2D. We demonstrate that an MLP can be used as a control variate in combination with our integration method, showing applications in light transport simulation.

1. Introduction. To synthesize photorealistic images, we need to solve notoriously complex integrals that model the underlying light transport. In general, these integrals do not have an analytic solution, and thus we employ tools of numerical integration to solve them. Among those, Monte Carlo integration is prominent, providing a general and robust solution that deals efficiently with, for example, high dimensions or discontinuities that other numerical methods may struggle with. Monte Carlo converges to the correct solution as the number of samples increases; however, it may require a large number of samples to suppress the variance below an acceptable threshold, as the variance otherwise manifests as high-frequency noise in the rendered images.
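The control-variate principle described in the abstract can be sketched with a toy 1D integrand in place of the paper's MLP and geometric subdivision; the integrand, the control variate, and the function name below are illustrative assumptions, not the paper's method.

```python
import random

def mc_with_control_variate(f, g, G, a, b, n, rng):
    """Estimate the integral of f over [a, b] as G plus a Monte Carlo
    estimate of the residual integral of (f - g), where G is the exact
    (analytic) integral of the control variate g over [a, b].
    The estimator stays unbiased because only the residual is sampled."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += f(x) - g(x)
    return G + (b - a) * total / n

# Toy integrand f(x) = x**2 on [0, 1]; control variate g(x) = x,
# whose exact integral over [0, 1] is G = 0.5.
rng = random.Random(0)
est = mc_with_control_variate(lambda x: x * x, lambda x: x,
                              0.5, 0.0, 1.0, 10_000, rng)
```

Since the residual x**2 - x varies far less over [0, 1] than x**2 itself, the estimate has lower variance than plain Monte Carlo at the same sample count, which is the whole point of the technique.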


Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares

Chung, William

arXiv.org Machine Learning

We propose a novel nonparametric regression method that models complex input-output relationships as the sum of convex and concave components. The method, Identifiable Convex-Concave Nonparametric Least Squares (ICCNLS), decomposes the target function into additive shape-constrained components, each represented via sub-gradient-constrained affine functions. To address the affine ambiguity inherent in convex-concave decompositions, we introduce global statistical orthogonality constraints, ensuring that residuals are uncorrelated with both the intercept and the input variables. This enforces identifiability of the decomposition and improves interpretability. We further incorporate L1, L2, and elastic-net regularisation on the sub-gradients to enhance generalisation and promote structural sparsity. The proposed method is evaluated on synthetic and real-world datasets, including healthcare pricing data, and demonstrates improved predictive accuracy and model simplicity compared to conventional CNLS and difference-of-convex (DC) regression approaches. Our results show that statistical identifiability, when paired with convex-concave structure and sub-gradient regularisation, yields interpretable models suited to forecasting, benchmarking, and policy evaluation.
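The convex-plus-concave structure, and the affine ambiguity that the orthogonality constraints are designed to remove, can be illustrated with a minimal sketch (not the paper's estimator): a convex piecewise-affine component as a max of affine pieces, a concave one as a min, and a check that moving an affine function from one component to the other leaves predictions unchanged.

```python
import numpy as np

def convex_part(x, A, b):
    """Pointwise max of affine pieces a_k . x + b_k: a convex CPA function."""
    return np.max(A @ x + b)

def concave_part(x, C, d):
    """Pointwise min of affine pieces: a concave CPA function."""
    return np.min(C @ x + d)

def model(x, A, b, C, d):
    """Sum of a convex and a concave component."""
    return convex_part(x, A, b) + concave_part(x, C, d)

rng = np.random.default_rng(0)
A, b = rng.normal(size=(3, 2)), rng.normal(size=3)
C, d = rng.normal(size=(3, 2)), rng.normal(size=3)
x = rng.normal(size=2)
base = model(x, A, b, C, d)

# Affine ambiguity: add w . x + c to every convex piece and subtract it
# from every concave piece -- the sum of the two components is unchanged.
w, c = np.array([0.5, -1.0]), 0.3
shifted = model(x, A + w, b + c, C - w, d - c)
```

Because the data cannot distinguish such shifted decompositions, extra constraints (here, the paper's statistical orthogonality conditions) are needed to pin down a unique convex/concave split.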


Model Informed Flows for Bayesian Inference of Probabilistic Programs

Ko, Joohwan, Domke, Justin

arXiv.org Machine Learning

Variational inference often struggles with the posterior geometry exhibited by complex hierarchical Bayesian models. Recent advances in flow-based variational families and Variationally Inferred Parameters (VIP) each address aspects of this challenge, but their formal relationship is unexplored. Here, we prove that the combination of VIP and a full-rank Gaussian can be represented exactly as a forward autoregressive flow augmented with a translation term and input from the model's prior. Guided by this theoretical insight, we introduce the Model-Informed Flow (MIF) architecture, which adds the necessary translation mechanism, prior information, and hierarchical ordering. Empirically, MIF delivers tighter posterior approximations and matches or exceeds state-of-the-art performance across a suite of hierarchical and non-hierarchical benchmarks.
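The forward autoregressive flow with a translation term mentioned in the abstract can be sketched on a toy two-level hierarchical model; the functions, constants, and name below are illustrative assumptions, not the paper's MIF architecture.

```python
import numpy as np

def forward_autoregressive_flow(eps, mu, sigma, t):
    """Map base noise eps to z one coordinate at a time:
    z_i = mu_i(z_{<i}) + t_i + sigma_i(z_{<i}) * eps_i,
    where t is an extra translation term (a constant vector here)."""
    z = np.zeros_like(eps)
    for i in range(len(eps)):
        m = mu(z[:i], i)      # shift conditioned on earlier coordinates
        s = sigma(z[:i], i)   # positive scale conditioned on earlier coordinates
        z[i] = m + t[i] + s * eps[i]
    return z

# Toy hierarchical model: z0 ~ N(0, 1), z1 | z0 ~ N(z0, 0.5**2).
# The flow's conditioners mirror the model's prior structure.
mu = lambda prev, i: 0.0 if i == 0 else prev[0]
sigma = lambda prev, i: 1.0 if i == 0 else 0.5
t = np.zeros(2)

rng = np.random.default_rng(0)
samples = np.array([forward_autoregressive_flow(rng.standard_normal(2), mu, sigma, t)
                    for _ in range(20_000)])
```

With these conditioners the flow reproduces the model's prior exactly (the marginal of z1 is N(0, 1.25)), illustrating how feeding prior information into an autoregressive flow lets it capture hierarchical dependence that a mean-field family would miss.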


ReLU Networks as Random Functions: Their Distribution in Probability Space

Chaudhari, Shreyas, Moura, José M. F.

arXiv.org Artificial Intelligence

This paper presents a novel framework for understanding trained ReLU networks as random affine functions, where the randomness is induced by the distribution over the inputs. By characterizing the probability distribution of the network's activation patterns, we derive the discrete probability distribution over the affine functions realizable by the network. We extend this analysis to describe the probability distribution of the network's outputs. Our approach provides explicit, numerically tractable expressions for these distributions in terms of Gaussian orthant probabilities. Additionally, we develop approximation techniques to identify the support of affine functions a trained ReLU network can realize for a given distribution of inputs. Our work provides a framework for understanding the behavior and performance of ReLU networks under stochastic inputs, paving the way for more interpretable and reliable models.
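The correspondence between activation patterns and affine functions can be sketched for a small one-hidden-layer ReLU network (a simplification of the paper's setting, with illustrative weights): each input's binary activation pattern selects a diagonal mask, and the network output equals the affine map induced by that mask, so sampling inputs induces a discrete distribution over patterns.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # hidden layer
w2, b2 = rng.normal(size=4), rng.normal()              # output layer

def net(x):
    """One-hidden-layer ReLU network R^2 -> R."""
    return w2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def pattern(x):
    """Binary activation pattern of the hidden units at input x."""
    return tuple((W1 @ x + b1 > 0).astype(int))

def affine_for(p):
    """The affine function (a, c) the network computes on the region
    whose activation pattern is p: a diagonal 0/1 mask absorbs the ReLU."""
    D = np.diag(p)
    return w2 @ D @ W1, w2 @ D @ b1 + b2

# Empirical distribution over activation patterns for Gaussian inputs,
# checking that every output matches the region's affine map.
counts = {}
for x in rng.normal(size=(1000, 2)):
    p = pattern(x)
    counts[p] = counts.get(p, 0) + 1
    a, c = affine_for(p)
    assert np.isclose(net(x), a @ x + c)
```

Normalizing `counts` gives the empirical version of the discrete distribution over realizable affine functions that the paper characterizes exactly via Gaussian orthant probabilities.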


Linear-Size Neural Network Representation of Piecewise Affine Functions in $\mathbb{R}^2$

Zanotti, Leo

arXiv.org Machine Learning

It is shown that any continuous piecewise affine (CPA) function $\mathbb{R}^2\to\mathbb{R}$ with $p$ pieces can be represented by a ReLU neural network with two hidden layers and $O(p)$ neurons. Unlike prior work, which focused on convex pieces, this analysis considers CPA functions with connected but potentially non-convex pieces.
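The flavour of such representations can be sketched with the identity max(a, b) = a + relu(b - a): stacking it twice yields a two-hidden-layer ReLU computation of a three-piece CPA function on R^2. This is a toy instance chosen for clarity, not the paper's general O(p) construction.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def max2(a, b):
    """max(a, b) = a + relu(b - a): one ReLU computes a max of two values."""
    return a + relu(b - a)

def cpa(x, y):
    """The CPA function max(x, y, -x - y): three affine pieces,
    computed by two stacked ReLU layers."""
    return max2(max2(x, y), -x - y)

# Sanity check against the direct definition on random points.
pts = np.random.default_rng(0).normal(size=(100, 2))
assert all(np.isclose(cpa(x, y), max(x, y, -x - y)) for x, y in pts)
```

Each ReLU layer absorbs one pairwise max of affine inputs, which is why depth two suffices here; the paper's contribution is keeping the neuron count linear in the number of pieces even when the pieces are non-convex.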