AITopics | sinh

Collaborating Authors

sinh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ROOT: Rethinking Offline Optimization as Distributional Translation via Probabilistic Bridge

Neural Information Processing SystemsJun-19-2026, 16:26:41 GMT

machine learning, natural language, optimization, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.87)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

Sampling from Flow Language Models via Marginal-Conditioned Bridges

Azangulov, Iskander, Zhang, Leo

arXiv.org Machine LearningMay-14-2026

Flow Language Models (FLMs) are a recently introduced class of language models which adapt continuous flow matching for one-hot encoded token sequences. Their denoisers have a special structure absent from generic continuous diffusion models: each block of the denoising mean is a posterior marginal distribution over the clean token at that position. Standard DDPM-style samplers collapse these marginals to a single conditional-mean endpoint and bridge toward this simplex-valued point, which is generally not a valid one-hot sequence. We argue that the natural sampler for an FLM is instead posterior-predictive. At each reverse step, we sample a clean one-hot endpoint from the factorized posterior defined by the FLM token marginals, and then sample the next continuous state from the analytic Ornstein--Uhlenbeck bridge conditioned on that endpoint. The method is training-free, uses the same model evaluations as standard sampling, and gives a principled interface for token-level decoding controls such as temperature scaling and nucleus truncation. We show that, under exact posterior marginals, the endpoint approximation error is exactly the conditional multi-information among token positions. The induced one-step bridge kernel preserves all token-wise posterior-predictive marginals and loses only the residual cross-position dependence. Finally, we prove a Girsanov path-space comparison showing that the marginal-conditioned bridge has a no-larger denoising-error term than the frozen conditional-mean bridge, with strict improvement whenever intermediate coordinate-wise bridge observations reveal additional information about the clean token. Experiments with FLMs show that the sampler improves the quality--diversity tradeoff. Code is available at: github.com/imbirik/mcb.

machine learning, natural language, sampler, (15 more...)

arXiv.org Machine Learning

2605.13681

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

a8901c5e85fb8e1823bbf0f755053672-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 05:42:09 GMT

Unlike in Metamath or Lean, we do not have access to a training set of human annotated proofs for this environment.

artificial intelligence, cosh, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.88)

Add feedback

Ananalytictheoryofshallownetworksdynamicsfor hingelossclassification--SupplementaryMaterial

Neural Information Processing SystemsFeb-8-2026, 03:05:35 GMT

In physical systems a particle instead interacts only with a finite number of other particles, hence the density field remains highly fluctuating. The effect of theθ(w x) term is to select one particular half-space over which the integralisdone. To estimate the fluctuations due to a finite number of nodes, we will have to estimate the width of the output distribution for a given set of parameters. Toestimate the error inFigure 1d ofthe main text, we ask what are the values ofxk = xcosθ such that the average output plus or minus a standard deviation, divided by M, would be equal to the threshold. Since the standard deviation involves|x|2, we estimate its average value for points 3 with a givenxk, i.e.

artificial intelligence, hingelossclassification, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > France (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

A Gap Between Decision Trees and Neural Networks

Kumar, Akash

arXiv.org Machine LearningJan-9-2026

We study when geometric simplicity of decision boundaries, used here as a notion of interpretability, can conflict with accurate approximation of axis-aligned decision trees by shallow neural networks. Decision trees induce rule-based, axis-aligned decision regions (finite unions of boxes), whereas shallow ReLU networks are typically trained as score models whose predictions are obtained by thresholding. We analyze the infinite-width, bounded-norm, single-hidden-layer ReLU class through the Radon total variation ($\mathrm{R}\mathrm{TV}$) seminorm, which controls the geometric complexity of level sets. We first show that the hard tree indicator $1_A$ has infinite $\mathrm{R}\mathrm{TV}$. Moreover, two natural split-wise continuous surrogates--piecewise-linear ramp smoothing and sigmoidal (logistic) smoothing--also have infinite $\mathrm{R}\mathrm{TV}$ in dimensions $d>1$, while Gaussian convolution yields finite $\mathrm{R}\mathrm{TV}$ but with an explicit exponential dependence on $d$. We then separate two goals that are often conflated: classification after thresholding (recovering the decision set) versus score learning (learning a calibrated score close to $1_A$). For classification, we construct a smooth barrier score $S_A$ with finite $\mathrm{R}\mathrm{TV}$ whose fixed threshold $τ=1$ exactly recovers the box. Under a mild tube-mass condition near $\partial A$, we prove an $L_1(P)$ calibration bound that decays polynomially in a sharpness parameter, along with an explicit $\mathrm{R}\mathrm{TV}$ upper bound in terms of face measures. Experiments on synthetic unions of rectangles illustrate the resulting accuracy--complexity tradeoff and how threshold selection shifts where training lands along it.

artificial intelligence, decision tree, machine learning, (17 more...)

arXiv.org Machine Learning

2601.03919

Country: North America > United States > California (0.28)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Decomposable Neuro Symbolic Regression

Morales, Giorgio, Sheppard, John W.

arXiv.org Artificial IntelligenceNov-7-2025

Symbolic regression (SR) models complex systems by discovering mathematical expressions that capture underlying relationships in observed data. However, most SR methods prioritize minimizing prediction error over identifying the governing equations, often producing overly complex or inaccurate expressions. To address this, we present a decomposable SR method that generates interpretable multivariate expressions leveraging transformer models, genetic algorithms (GAs), and genetic programming (GP). In particular, our explainable SR method distills a trained ``opaque'' regression model into mathematical expressions that serve as explanations of its computed function. Our method employs a Multi-Set Transformer to generate multiple univariate symbolic skeletons that characterize how each variable influences the opaque model's response. We then evaluate the generated skeletons' performance using a GA-based approach to select a subset of high-quality candidates before incrementally merging them via a GP-based cascade procedure that preserves their original skeleton structure. The final multivariate skeletons undergo coefficient optimization via a GA. We evaluated our method on problems with controlled and varying degrees of noise, demonstrating lower or comparable interpolation and extrapolation errors compared to two GP-based methods, three neural SR methods, and a hybrid approach. Unlike them, our approach consistently learned expressions that matched the original mathematical structure.

evolutionary algorithm, machine learning, tanh, (18 more...)

arXiv.org Artificial Intelligence

2511.04124

Country: North America > United States (0.45)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

On the flow matching interpretability

Pivi, Francesco, Gazza, Simone, Evangelista, Davide, Amadini, Roberto, Gabbrielli, Maurizio

arXiv.org Artificial IntelligenceOct-27-2025

Generative models based on flow matching have demonstrated remarkable success in various domains, yet they suffer from a fundamental limitation: the lack of interpretability in their intermediate generation steps. In fact these models learn to transform noise into data through a series of vector field updates, however the meaning of each step remains opaque. We address this problem by proposing a general framework constraining each flow step to be sampled from a known physical distribution. Flow trajectories are mapped to (and constrained to traverse) the equilibrium states of the simulated physical process. We implement this approach through the 2D Ising model in such a way that flow steps become thermal equilibrium points along a parametric cooling schedule. Our proposed architecture includes an encoder that maps discrete Ising configurations into a continuous latent space, a flow-matching network that performs temperature-driven diffusion, and a projector that returns to discrete Ising states while preserving physical constraints. We validate this framework across multiple lattice sizes, showing that it preserves physical fidelity while outperforming Monte Carlo generation in speed as the lattice size increases. In contrast with standard flow matching, each vector field represents a meaningful stepwise transition in the 2D Ising model's latent space. This demonstrates that embedding physical semantics into generative flows transforms opaque neural trajectories into interpretable physical processes.

artificial intelligence, interpretability, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.2121

Country: Europe > Italy (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

ROOT: Rethinking Offline Optimization as Distributional Translation via Probabilistic Bridge

Dao, Manh Cuong, Tran, The Hung, Nguyen, Phi Le, Truong, Thao Nguyen, Hoang, Trong Nghia

arXiv.org Artificial IntelligenceOct-24-2025

This paper studies the black-box optimization task which aims to find the maxima of a black-box function using a static set of its observed input-output pairs. This is often achieved via learning and optimizing a surrogate function with that offline data. Alternatively, it can also be framed as an inverse modeling task that maps a desired performance to potential input candidates that achieve it. Both approaches are constrained by the limited amount of offline data. To mitigate this limitation, we introduce a new perspective that casts offline optimization as a distributional translation task. This is formulated as learning a probabilistic bridge transforming an implicit distribution of low-value inputs (i.e., offline data) into another distribution of high-value inputs (i.e., solution candidates). Such probabilistic bridge can be learned using low- and high-value inputs sampled from synthetic functions that resemble the target function. These synthetic functions are constructed as the mean posterior of multiple Gaussian processes fitted with different parameterizations on the offline data, alleviating the data bottleneck. The proposed approach is evaluated on an extensive benchmark comprising most recent methods, demonstrating significant improvement and establishing a new state-of-the-art performance. Our code is publicly available at https://github.com/cuong-dm/ROOT.

machine learning, natural language, optimization, (19 more...)

arXiv.org Artificial Intelligence

2509.163

Country: North America > United States (0.46)

Genre: