AITopics

2605.21253

Country: Europe (0.68)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

De Castro, Yohann, Gadat, Sébastien, Marteau, Clément

Fast Spawn&Prune (FS&P): Global convergence of stochastic conic particle gradient descent via birth/death process

arXiv.org Machine Learning May-20-2026

We investigate the global optimization of the objective function arising in continuous sparse regression, specifically the Beurling LASSO (BLASSO), over the space of measures. While Conic Particle Gradient Descent (CPGD) methods are computationally efficient, they may become trapped in local minima due to the non-convexity of the parameterization. To overcome this limitation, we introduce Fast Spawn&Prune (FS&P), a stochastic algorithm that extends FastPart introduced in De Castro et al. (2025) and combines CPGD with a birth-death process. The birth mechanism ensures asymptotic global exploration by introducing particles in regions where first-order optimality conditions are violated, while the death process preserves computational efficiency by pruning non-informative particles. We provide the first theoretical guarantee of global convergence for this class of discrete-time stochastic algorithms, without requiring exponentially large initializations. Furthermore, we derive explicit convergence rates for the excess risk, which scale as $\mathcal{O}\big(\left(\log K / K\right)^{\frac{1}{2(2+d)}}\big)$, where $K$ denotes the number of iterations and $d$ the dimension of the domain, thereby quantifying the trade-off between global exploration and local refinement. Moreover, the sample complexity is $\mathcal{O}\big(N^{-\frac{1}{4(2+d)}}\big)$ (up to logarithmic factors). We also propose a horizon-free variant that does not require prior knowledge of the iteration budget.
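
To give a feel for how the stated excess-risk rate depends on the iteration budget and the dimension, the following minimal sketch evaluates $(\log K / K)^{1/(2(2+d))}$ for a few values. The function name `excess_risk_rate` is hypothetical, and the constant hidden in the $\mathcal{O}(\cdot)$ is assumed to be 1 purely for illustration; this is not the authors' code.

```python
import math


def excess_risk_rate(num_iterations: int, dim: int) -> float:
    """Evaluate (log K / K) ** (1 / (2 * (2 + d))) with the O(.) constant set to 1.

    This only illustrates the scaling quoted in the abstract; the true bound
    carries problem-dependent constants that are not modeled here.
    """
    if num_iterations < 2:
        raise ValueError("num_iterations must be at least 2 so that log K > 0")
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    exponent = 1.0 / (2 * (2 + dim))
    return (math.log(num_iterations) / num_iterations) ** exponent


if __name__ == "__main__":
    # Print the rate for a small grid of iteration budgets and dimensions.
    for dim in (1, 2, 3):
        for num_iterations in (10**3, 10**6):
            rate = excess_risk_rate(num_iterations, dim)
            print(f"d={dim}, K={num_iterations}: rate ~ {rate:.4f}")
```

Under these assumptions the rate shrinks as $K$ grows and degrades as $d$ increases, since the exponent $1/(2(2+d))$ gets smaller in higher dimensions.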

artificial intelligence, assumption, machine learning, (18 more...)

2605.19784

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Neural Information Processing Systems Apr-30-2026, 10:09:28 GMT

fd8872fcba4ba87312cdfe5ebba91ca9-Supplemental-Conference.pdf

The appendix includes the missing proofs, detailed discussions of some arguments in the main body, and more numerical experiments. We organize the appendix as follows: The proof of the infeasibility condition (Theorem 3.2) is provided in Section B. Explanations of the conditions derived in Theorem 3.2 are included in Section C. The proofs of the properties of the proposed model (r)LogSpecT (Propositions 3.4 & 3.6) are given in Section D, where some additional properties are also discussed. The truncated Hausdorff distance based proof details of Theorem 4.1 and Corollary 4.4 are given in Section E. Details of L-ADMM and its convergence analysis are in Section F. Additional experiments and discussions on synthetic data are included in Section G. Since the linear system (4) has no solution, we know from Farkas' lemma that the following system [...] Hence, S is also a solution to (13). However, (13) does not have a solution. We can conclude that rSpecT is infeasible in this case.

artificial intelligence, low-pass parameter, rlogspect, (14 more...)

Technology: Information Technology > Artificial Intelligence (0.91)

Neural Information Processing Systems Apr-30-2026, 07:48:00 GMT

How to Turn Your Knowledge Graph Embeddings into Generative Models

Some of the most successful knowledge graph embedding (KGE) models for link prediction - CP, RESCAL, TUCKER, COMPLEX - can be interpreted as energy-based models. Under this perspective they are not amenable to exact maximum-likelihood estimation (MLE) or sampling, and they struggle to integrate logical constraints. This work re-interprets the score functions of these KGEs as circuits - constrained computational graphs allowing efficient marginalisation. Then, we design two recipes to obtain efficient generative circuit models by either restricting their activations to be non-negative or squaring their outputs. Our interpretation comes with little or no loss of performance for link prediction, while the circuits framework unlocks exact learning by MLE, efficient sampling of new triples, and a guarantee that logical constraints are satisfied by design.
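
The non-negative recipe can be illustrated with a toy CP-style score. The sketch below is not the authors' implementation: the factor matrices `S`, `P`, `O`, the sizes, and all function names are hypothetical. It assumes non-negative embeddings, so that the score is an unnormalised probability, and shows why marginalisation is cheap: the sum over all triples factorises into a product of per-factor sums.

```python
import numpy as np

# Toy sizes and random non-negative factors (all hypothetical).
rng = np.random.default_rng(0)
n_entities, n_relations, rank = 5, 3, 4
S = rng.random((n_entities, rank))   # subject embeddings, entries in [0, 1)
P = rng.random((n_relations, rank))  # predicate embeddings
O = rng.random((n_entities, rank))   # object embeddings


def cp_score(s: int, p: int, o: int) -> float:
    """CP score: sum over rank r of S[s, r] * P[p, r] * O[o, r]."""
    return float(np.sum(S[s] * P[p] * O[o]))


def partition_function() -> float:
    """Sum of scores over all triples, computed without enumerating them.

    Because the score is a sum of products, the triple sum factorises as
    sum_r (sum_s S[s, r]) * (sum_p P[p, r]) * (sum_o O[o, r]).
    """
    return float(np.sum(S.sum(axis=0) * P.sum(axis=0) * O.sum(axis=0)))


def brute_force_partition() -> float:
    """Reference value obtained by enumerating every (s, p, o) triple."""
    return sum(
        cp_score(s, p, o)
        for s in range(n_entities)
        for p in range(n_relations)
        for o in range(n_entities)
    )


def triple_probability(s: int, p: int, o: int) -> float:
    """Exactly normalised probability of a triple under the toy model."""
    return cp_score(s, p, o) / partition_function()
```

In this toy setting the factorised normaliser costs time linear in the number of entities and relations, whereas brute-force enumeration is cubic in them; this is the kind of efficient marginalisation the circuit view is said to provide.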

artificial intelligence, logic & formal reasoning, machine learning, (22 more...)

Country: Europe (1.00)

Genre:

Instructional Material (0.46)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
(2 more...)

Geenens, Gery, de Micheaux, Pierre Lafaye, Zou, Ivan Muyun

Deep-testing: the case of dependence detection

arXiv.org Machine Learning Apr-30-2026

Deep learning methods have proved highly effective for classification and image recognition problems. In this paper, we ask whether this success can be transferred to hypothesis testing: if a neural network can distinguish, for example, an image of a handwritten digit from another, can it also distinguish an "image of a sample" (such as a scatter plot) generated under a given statistical model from one generated outside that model? Motivated by this idea, we propose a novel procedure called deep-testing, which approaches the classical inferential problem of hypothesis testing through deep learning. More specifically, the test statistic is a classification map learned by a deep neural network from simulated data satisfying the null and alternative hypotheses, leveraging its strong discriminating power to construct a highly powerful test. As a proof of concept, we apply deep-testing to the problem of independence testing, arguably one of the most important problems in statistics. In a large-scale simulation study, deep-testing achieves the highest overall power against nineteen competing methods across a broad range of complex dependence structures, confirming the viability of the proposed approach.

artificial intelligence, machine learning, resp, (16 more...)

2604.26558

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Neural Information Processing Systems Apr-25-2026, 16:36:08 GMT

PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving

This paper presents an end-to-end neural architecture based on Diffusion Models for spatial puzzle solving, particularly jigsaw puzzle and room arrangement tasks. In the latter task, for instance, the proposed system takes a set of room layouts as polygonal curves in the top-down view and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solving the jigsaw puzzle of room layouts. A surprising discovery of the paper is that the simple use of a Diffusion Model effectively solves these challenging spatial puzzle tasks as a conditional generation process. To enable learning of an end-to-end neural system, the paper introduces new datasets with ground-truth arrangements: 1) the 2DVoronoi jigsaw dataset, a synthetic one where pieces are generated by the Voronoi diagram of a 2D point set; and 2) the MagicPlan dataset, a real one offered by MagicPlan from its production pipeline, where pieces are room layouts constructed by real-estate consumers with an augmented reality app. The qualitative and quantitative evaluations demonstrate that our approach outperforms the competing methods by significant margins in all the tasks. We have provided code and data here.

artificial intelligence, machine learning, noise level, (16 more...)

Industry: Banking & Finance > Real Estate (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Neural Information Processing Systems Apr-25-2026, 01:34:21 GMT

Supplementary for Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity

In this supplementary material, we provide more analyses of mask prior in Section 1 and similarity transfer in Section 2. We show the visualization results in Section 3 and the performance variance with iteration in Section 4. We also conduct experiments to mine base categories in the target dataset in Section 5. The hyper-parameter analyses are provided in Section 6. Finally, we discuss the limitations in Section 7. As mentioned in Section 3.2 of the main paper, mask prior provides coarse pixel-wise category information to improve the ability of the object detection network to locate and identify objects. Our ablation studies (Table 3 in the main paper) have already shown the advantage of mask prior. To further evaluate the effectiveness of mask prior, we evaluate the object detection network with and without the mask generator on the VOC test set. Considering that the target dataset may contain both base categories and novel categories, of which only novel categories have ground-truth bounding boxes, we evaluate our method on novel categories.

category, machine learning, natural language, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.87)
Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.42)

Neural Information Processing Systems Apr-24-2026, 09:18:06 GMT

Causal Discovery in Linear Latent Variable Models Subject to Measurement Error

We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to permutation and scaling of the columns. We demonstrate a somewhat surprising connection between this problem and causal discovery in the presence of unobserved parentless causes, in the sense that there is a mapping, given by the mixing matrix, between the underlying models to be inferred in these problems. Consequently, any identifiability result based on the mixing matrix for one model translates to an identifiability result for the other model. We characterize to what extent the causal models can be identified under a two-part faithfulness assumption. Under only the first part of the assumption (corresponding to the conventional definition of faithfulness), the structure can be learned up to the causal ordering among an ordered grouping of the variables but not all the edges across the groups can be identified. We further show that if both parts of the faithfulness assumption are imposed, the structure can be learned up to a more refined ordered grouping. As a result of this refinement, for the latent variable model with unobserved parentless causes, the structure can be identified. Based on our theoretical results, we propose causal structure learning methods for both models, and evaluate their performance on synthetic data.

artificial intelligence, equivalence class, machine learning, (17 more...)

Genre: Research Report (0.93)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

arXiv.org Machine Learning Mar-19-2026

Shallow Representation of Option Implied Information

Lin, Jimin

Option prices encode the market's collective outlook through implied density and implied volatility. An explicit link between implied density and implied volatility translates the risk-neutrality of the former into conditions on the latter to rule out static arbitrage. Despite earlier recognition of their parity, the two had been studied in isolation for decades until the recent demand in implied volatility modeling rejuvenated such parity. This paper provides a systematic approach to build neural representations of option implied information. As a preliminary, we first revisit the explicit link between implied density and implied volatility through an alternative and minimalist lens, where implied volatility is viewed not as volatility but as a pointwise corrector mapping the Black-Scholes quasi-density into the implied risk-neutral density. Building on this perspective, we propose the neural representation that incorporates arbitrage constraints through the differentiable corrector. With an additive logistic model as the synthetic benchmark, extensive experiments reveal that deeper or wider network structures do not necessarily improve the model performance due to the nonlinearity of both arbitrage constraints and neural derivatives. By contrast, a shallow feedforward network with a single hidden layer and a specific activation effectively approximates implied density and implied volatility.

artificial intelligence, machine learning, volatility, (19 more...)

2603.17151

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems Feb-19-2026, 05:08:26 GMT

Adversarial Reweighting for Partial Domain Adaptation

The conventional closed-set DA methods generally assume that the source and target domains share the same label space. However, this assumption is often not realistic in practice.

artificial intelligence, incvpr, machine learning, (16 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)