AITopics

Design, Cups, and Blankets. A Free-Energy-Principle-Based Approach to Product Design

Possati, Luca M.

Classical design theory treats the type of an object as a given: the designer decides in advance that this will be a cup, then optimizes its parameters. This paper argues that object type is not a presupposition but an inference, something that can be determined from physical data and functional requirements jointly. We call this problem requirement-steered interface type inference and show that it is inexpressible within existing design frameworks. This paper makes two contributions that are jointly necessary and individually incomplete. The first is the problem itself, which classical design cannot pose because it presupposes the very thing our problem seeks to determine. The second is C-DMBD, a constrained extension of the Dynamic Markov Blanket Detection algorithm, which makes requirement-steered inference computationally tractable. Drawing on the free-energy principle and active inference, established frameworks in theoretical neuroscience and Bayesian mechanics, we model a product's surface as a Markov blanket: the minimal boundary through which all causal exchange between object and environment must pass. Different blanket structures correspond to different object types; different parameterizations of the same structure correspond to different functional modes of the same type. This paper is a proof of concept and a theoretical proposal. It reframes design as inference rather than optimization, and as a relation between generative models rather than a specification of parameters.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2604.22902

Country: Europe (0.28)

Genre: Research Report (0.70)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

A Continuous-Time Ensemble Kalman-Bucy Smoother for Causal Inference and Model Discovery

Jiang, Zhang, Andreou, Marios, Reich, Sebastian, Chen, Nan

Data assimilation (DA) integrates observational information with model predictions to improve state estimation in complex systems. While filtering provides the basis for online forecasts by using only past and present observations, it can exhibit delays and biases when the underlying dynamics evolve rapidly or undergo regime transitions. Smoothing, which additionally incorporates future observations, provides a natural pipeline for hindcasting and reanalysis that yields an uncertainty reduction beyond the filter. This paper introduces an ensemble Kalman-Bucy smoother (EnKBS) for continuous-time DA of nonlinear dynamical systems, where the smoother's conditional distributions are reconstructed using ensemble moments. The result is a derivative-free framework that does not require explicit computation of tangent-linear or adjoint models, which converges to the exact smoother solution at the infinite-ensemble limit for a wide class of complex systems. Incorporating standard regularization techniques for high-dimensional systems, such as covariance localization and inflation, the skill of the EnKBS is demonstrated in various important scientific problems. By integrating future observations, which reveal the underlying causal mechanisms for retrospective state updates, the EnKBS is used for Bayesian-based inference of causal relationships and their temporal influence range in a dyadic trigger-feedback model and the development of a causality-driven iterative learning algorithm that identifies the structure and recovers the hidden parameters of a nonlinear reduced-order model mimicking midlatitude atmospheric circulation. Notably, both tasks remain effective with an ensemble size of $O(10)$ under partial observations, suggesting that EnKBS can support the instantaneous discovery of high-dimensional complex systems over time.

artificial intelligence, enkbs, machine learning, (18 more...)

2604.25157

Country:

Europe > United Kingdom > England (0.46)
North America > United States > New York (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Tang, Wei-Ting, Kudva, Akshay, Tsay, Calvin, Paulson, Joel A.

An Efficient Spatial Branch-and-Bound Algorithm for Global Optimization of Gaussian Process Posterior Mean Functions

We study the deterministic global optimization of trained Gaussian process posterior mean functions over hyperrectangular domains. Although the posterior mean function has a compact closed-form representation, its global optimization is challenging because it remains nonlinear and nonconvex. Existing exact deterministic approaches become increasingly difficult to scale as the number of training data points grows, leading to approximation-based methods that improve tractability by optimizing a modified (inexact) objective. In this work, we propose PALM-Mean, a piecewise-analytic lower-bounding framework embedded in reduced-space spatial branch-and-bound. At each node, kernel terms that are locally important are replaced by a sign-aware piecewise-linear relaxation in an appropriate scalar distance variable, while the remaining terms are bounded analytically in closed form. We show this hybrid approach yields a valid lower bound for the posterior mean, while limiting the size of the branch-and-bound subproblems. We establish validity of the node lower bounds and $\varepsilon$-global convergence of the resulting algorithm. Computational results on synthetic benchmarks and real-world application problems show that PALM-Mean improves scalability relative to representative general-purpose deterministic global solvers, particularly as the number of training data points increases.

artificial intelligence, optimization, optimization problem, (18 more...)

2605.00855

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report > New Finding (0.46)

Industry: Materials > Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Mathys, Christoph, Legrand, Nicolas, Waade, Peter Thestrup, Mikus, Nace, Weber, Lilian Aline

Robust volatility updates for Hierarchical Gaussian Filtering

Hierarchical Gaussian Filtering (HGF) networks allow for efficient updating of posterior distributions (beliefs) about hidden states of an agent's environment. HGF parent nodes can target the mean or variance of their children. New information entering at input nodes leads to a cascade of belief updates across the network according to one-step update equations for each node's mean and precision (inverse variance). However, the original form of the update equations for variance-targeting parents(volatility coupling) can in some regions of parameter space lead to negative posterior precision, a logical impossibility which causes the updating algorithm to terminate with an error. In this report, we introduce a modified quadratic approximation to the variational energy of volatility-coupled nodes that avoids negative posterior precision. The key idea is to interpolate between two quadratic expansions of the variational energy: one at the prior prediction and one at a second mode whose location is obtained in closed form via the Lambert W function. The resulting update equations are robust across the entire parameter space and faithfully track the variational posterior even for large prediction errors.

artificial intelligence, machine learning, variational energy, (17 more...)

2605.00966

Country: Europe (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Diffusion Operator Geometry of Feedforward Representations

Reddy, Kanishka

Neural networks transform data through learned representations whose geometry affects separation, contraction, and generalization. Recent work studies this geometry using discrete curvature on neighborhood graphs, suggesting Ricci-flow-like behavior across layers. We develop a smooth operator-theoretic alternative for feedforward representation snapshots. Each feature cloud induces a Gaussian-kernel diffusion Markov operator, and transport, spectral, label-boundary, and local-scale observables are derived from this single object via Bakry-Emery $Γ$-calculus. In a balanced Gaussian class-conditional snapshot model with shared covariance, the population operator has closed-form class affinities, leakage, and coarse spectra, all controlled by pairwise regularized Mahalanobis separations $c_\varepsilon^{(a,b)}$. We also prove that the resulting operator observables vary smoothly under feature perturbations, while hard neighborhood-graph diagnostics can change discontinuously. Synthetic experiments validate the closed-form Gaussian bridge, while learned MNIST experiments show that the same operator observables track training, width, and perturbation stability. Together, these results give a stable operator-geometric framework for analyzing feedforward representation geometry.

artificial intelligence, machine learning, perturbation, (16 more...)

2605.01107

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Topological Neural Tangent Kernel

Krishnagopal, Sanjukta

Graph neural tangent kernels give a principled infinite-width theory for graph neural networks, but inherit a basic limitation of graph models: they see only pairwise structure. Many relational systems contain higher-order interactions that are more naturally represented by simplicial complexes. We introduce the Topological Neural Tangent Kernel (TopoNTK), an infinite-width kernel for simplicial message passing on edge features. TopoNTK combines lower Hodge interactions, capturing graph-like coupling through shared vertices, with upper Hodge interactions, capturing coupling through filled simplices. This makes the kernel sensitive to topology invisible to graph kernels, allowing complexes with the same graph but different filled simplices to induce different kernels. Beyond expressivity, the Hodge structure gives the kernel an interpretable learning geometry. Edge signals decompose into gradient-like, harmonic, and local circulation components, and the spectrum of the TopoNTK determines how quickly each component is learned. This yields a topological form of spectral bias: components aligned with large-eigenvalue modes are learned quickly, while global harmonic modes, retained through the residual channel, often lie at smaller eigenvalues and are learned more slowly. We prove expressivity, Hodge-alignment, spectral learning, and stability properties, and validate them on synthetic simplicial tasks and DBLP higher-order link prediction. The results show that topology is not merely extra structure; it can provide coordinates that make relational learning more faithful, interpretable, and effective.

artificial intelligence, kernel, machine learning, (18 more...)

2605.0111

Country: North America > United States (0.28)

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Spectral Graph Sparsification Preserves Representation Geometry in Graph Neural Networks

Krishnagopal, Sanjukta

Spectral graph sparsification is a classical tool for reducing graph complexity while preserving Laplacian quadratic forms. In graph neural networks (GNNs), sparsification is often used to accelerate computation while maintaining predictive performance. In this work, we study a complementary representation-level question: does sparsification preserve the geometry of learned embeddings? For polynomial-filter GNNs, we prove that any $ε$-spectral sparsifier induces $O(ε)$ perturbations in polynomial graph filters, multilayer hidden representations, and their Gram matrices. These guarantees imply stability of squared pairwise distances, class means, and covariance structure in embedding space. We further establish finite-time training stability: under smoothness and boundedness assumptions, gradient descent on dense and sparsified graphs produces weight trajectories whose separation grows at most proportionally to the sparsification distortion. Empirically, effective-resistance sparsification validates the predicted perturbation chain on synthetic graphs and preserves hidden representation geometry on real datasets. In our experiments, the gram matrix and training dynamics show low divergence even under substantial sparsification, consistent with the predicted stability under spectral sparsification. Hidden Gram preservation strongly predicts neighborhood preservation and class-centroid stability across FashionMNIST, Cora, and Paul15. Together, these results show that spectral sparsification preserves not only graph operators, but also the representation geometry that supports downstream use of GNN embeddings for interpretability.

artificial intelligence, machine learning, sparsification, (18 more...)

2605.01136

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

A Theory of Generalization in Deep Learning

Litman, Elon, Guo, Gabe

We present a non-asymptotic theory of generalization in deep learning where the empirical neural tangent kernel partitions the output space. In directions corresponding to signal, error dissipates rapidly; in the vast orthogonal dimensions corresponding to noise, the kernel's near-zero eigenvalues trap residual error in a test-invisible reservoir. Within the signal channel, minibatch SGD ensures that coherent population signal accumulates via fast linear drift, while idiosyncratic memorization is suppressed into a slow, diffusive random walk. We prove generalization survives even when the kernel evolves $\mathcal{O}(1)$ in operator norm, the full feature-learning regime. This theory naturally explains disparate phenomena in deep learning theory, such as benign overfitting, double descent, implicit bias, and grokking. Lastly, we derive an exact population-risk objective from a single training run with no validation data, for any architecture, loss, or optimizer, and prove that it measures precisely the noise in the signal channel. This objective reduces in practice to an SNR preconditioner on top of Adam, adding one state vector at no extra cost; it accelerates grokking by $5 \times$, suppresses memorization in PINNs and implicit neural representations, and improves DPO fine-tuning under noisy preferences while staying $3 \times$ closer to the reference policy.

artificial intelligence, machine learning, operator, (19 more...)

2605.01172

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Maurya, Sunil Kumar, Liu, Xin

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

With the rapidly improving reasoning abilities of Large Language Models (LLMs), there is also a rising demand to use them in a wide variety of domains. This brings about the need to carefully evaluate the limits of the capabilities of these models with various tests and benchmarks. Graph structures are ubiquitous in real-world data, and are often used to represent and analyze relationship patterns within data. Many benchmarks have already been proposed in the graph literature to test the reasoning ability of LLMs to follow and execute graph algorithms. However, due to the limited context length of LLMs, these benchmarks consist of very small graphs. In real-world data, the size of graphs can be significantly larger, and in many cases, not fully accessible. In this paper, we examine a class of problems that arises with very large graphs having limited accessibility. We propose a large graph benchmark dataset, EstGraph, and introduce four distinct tasks designed to estimate large graph properties. We evaluate the reasoning abilities of LLMs on these tasks using a wide variety of graph datasets. In addition, we provide task-specific prompt constructions based on random walk sampling of large graphs (up to millions of nodes) that effectively convey sufficient information to LLMs within the limits of context length.

large language model, machine learning, natural language, (21 more...)

2605.01484

Country: North America > United States > California (0.27)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)