AITopics

2603.27864

Country:

North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (0.68)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)

Carriere, Mathieu, Ike, Yuichi, Lacombe, Théo, Nishikawa, Naoki

Persistence-based topological optimization: a survey

arXiv.org Machine Learning Mar-27-2026

Computational topology provides a tool, persistent homology, to extract quantitative descriptors from structured objects (images, graphs, point clouds, etc.). These descriptors can then be used in optimization problems, typically as a way to incorporate topological priors or to regularize machine learning models. This is usually achieved by minimizing suitable, topologically-informed losses based on these descriptors, which, in turn, naturally raises theoretical and practical questions about the possibility of optimizing such loss functions using gradient-based algorithms. This has been an active research field in the topological data analysis community over the last decade, and various techniques have been developed to enable optimization of persistence-based loss functions with gradient descent schemes. This survey presents the current state of this field, covering its theoretical foundations and algorithmic aspects, and showcasing practical uses in several applications. It includes a detailed introduction to persistence theory and, as such, aims to be accessible to mathematicians and data scientists who are new to the field. It is accompanied by an open-source library which implements the different approaches covered in this survey, providing a convenient playground for researchers to get familiar with the field.
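
As a rough illustration of what a persistence-based descriptor and loss can look like, the sketch below computes the 0-dimensional persistence of a point cloud under a Vietoris-Rips filtration, where every point is born at scale 0 and a class dies when two connected components merge. It does not use the survey's accompanying library; the function names `h0_death_times` and `total_persistence` are hypothetical, and the scale convention (edge length rather than half the edge length) is an assumption.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Death times of the finite 0-dimensional persistence classes of a
    Vietoris-Rips filtration (every point is born at scale 0)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one H0 class dies at this scale.
            parent[ri] = rj
            deaths.append(length)
    return deaths


def total_persistence(points):
    # Sum of (death - birth) with birth = 0; one simple topological loss.
    return sum(h0_death_times(points))


cloud = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(h0_death_times(cloud))     # [1.0, 2.0]
print(total_persistence(cloud))  # 3.0
```

Each death time equals the distance between one specific pair of points, so a loss such as `total_persistence` is piecewise differentiable in the point coordinates, which is what makes gradient-based schemes conceivable in this setting.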

artificial intelligence, machine learning, survey article, (18 more...)

2603.24613

Country:

North America > United States (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > Germany (0.04)
(3 more...)

Genre: Overview (0.74)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Bok, Jinho, Li, Shuangping, Yu, Sophie H.

Detection of local geometry in random graphs: information-theoretic and computational limits

arXiv.org Machine Learning Mar-26-2026

We study the problem of detecting local geometry in random graphs. We introduce a model $\mathcal{G}(n, p, d, k)$, where a hidden community of average size $k$ has edges drawn as a random geometric graph on $\mathbb{S}^{d-1}$, while all remaining edges follow the Erdős--Rényi model $\mathcal{G}(n, p)$. The random geometric graph is generated by thresholding inner products of latent vectors on $\mathbb{S}^{d-1}$, with each edge having marginal probability equal to $p$. This implies that $\mathcal{G}(n, p, d, k)$ and $\mathcal{G}(n, p)$ are indistinguishable at the level of the marginals, and the signal lies entirely in the edge dependencies induced by the local geometry. We investigate both the information-theoretic and computational limits of detection. On the information-theoretic side, our upper bounds follow from three tests based on signed triangle counts: a global test, a scan test, and a constrained scan test; our lower bounds follow from two complementary methods: truncated second moment via Wishart--GOE comparison, and tensorization of KL divergence. These results together settle the detection threshold at $d = \widetilde{\Theta}(k^2 \vee k^6/n^3)$ for fixed $p$, and extend the state-of-the-art bounds from the full model (i.e., $k = n$) for vanishing $p$. On the computational side, we identify a computational--statistical gap and provide evidence via the low-degree polynomial framework, as well as the suboptimality of signed cycle counts of length $\ell \geq 4$.
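
For readers unfamiliar with the statistic named above, here is a minimal sketch of a global signed triangle count in its usual form: the sum over vertex triples of the product of centered edge indicators. The function name is hypothetical, and the paper's exact normalization, thresholds, and scan variants are not reproduced here.

```python
from itertools import combinations


def signed_triangle_count(adj, p):
    """Sum over vertex triples of the product of centered edge indicators."""
    n = len(adj)
    total = 0.0
    for i, j, k in combinations(range(n), 3):
        total += (adj[i][j] - p) * (adj[j][k] - p) * (adj[i][k] - p)
    return total


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(signed_triangle_count(triangle, 0.5))  # 0.5 * 0.5 * 0.5 = 0.125
print(signed_triangle_count(path, 0.5))      # 0.5 * 0.5 * (-0.5) = -0.125
```

Under $\mathcal{G}(n, p)$ the edges are independent, so every term has mean zero; dependencies among the three edges of a triple are what can push the statistic away from zero even though the marginals match.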

artificial intelligence, inequality, machine learning, (19 more...)

2603.24545

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre:

Research Report (0.63)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.61)

Kévin Degraux, Gabriel Peyré, Jalal Fadili, Laurent Jacques

Sparse Support Recovery with Non-smooth Loss Functions

Neural Information Processing Systems Mar-23-2026, 09:56:35 GMT

In this paper, we study the support recovery guarantees of underdetermined sparse regression using the $\ell_1$-norm as a regularizer and a non-smooth loss function for data fidelity. More precisely, we focus in detail on the cases of $\ell_1$ and $\ell_\infty$ losses, and contrast them with the usual $\ell_2$ loss. While these losses are routinely used to account for either sparse ($\ell_1$ loss) or uniform ($\ell_\infty$ loss) noise models, a theoretical analysis of their performance is still lacking. In this article, we extend the existing theory from the smooth $\ell_2$ case to these non-smooth cases. We derive a sharp condition which ensures that the support of the vector to recover is stable to small additive noise in the observations, as long as the loss constraint size is tuned proportionally to the noise level. A distinctive feature of our theory is that it also explains what happens when the support is unstable. While the support is not stable anymore, we identify an "extended support" and show that this extended support is stable to small additive noise. To exemplify the usefulness of our theory, we give a detailed numerical analysis of the support stability/instability of compressed sensing recovery with these different losses. This highlights different parameter regimes, ranging from total support stability to progressively increasing support instability.

artificial intelligence, machine learning, survey article, (17 more...)

Country:

Europe > France (0.14)
Europe > Belgium (0.14)

Genre: Overview (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Feng Nan, Joseph Wang, Venkatesh Saligrama

Pruning Random Forests for Prediction on a Budget

Neural Information Processing Systems Mar-23-2026, 06:52:52 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, feature cost, machine learning, (20 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.65)

Balamurugan Palaniappan, Francis Bach

Stochastic Variance Reduction Methods for Saddle-Point Problems

Neural Information Processing Systems Mar-23-2026, 03:04:08 GMT

We consider convex-concave saddle-point problems where the objective functions may be split in many components, and extend recent stochastic variance reduction methods (such as SVRG or SAGA) to provide the first large-scale linearly convergent algorithms for this class of problems which are common in machine learning. While the algorithmic extension is straightforward, it comes with challenges and opportunities: (a) the convex minimization analysis does not apply and we use the notion of monotone operators to prove convergence, showing in particular that the same algorithm applies to a larger class of problems, such as variational inequalities, (b) there are two notions of splits, in terms of functions, or in terms of partial derivatives, (c) the split does need to be done with convex-concave terms, (d) non-uniform sampling is key to an efficient algorithm, both in theory and practice, and (e) these incremental algorithms can be easily accelerated using a simple extension of the "catalyst" framework, leading to an algorithm which is always superior to accelerated batch algorithms.
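
To make the idea of variance reduction for saddle points concrete, the following is a simplified sketch, not the authors' algorithm: it applies an SVRG-style update to the monotone operator of a hypothetical scalar toy problem, using plain (non-proximal) steps and uniform sampling. The problem data `A`, the step size, and the loop lengths are all assumptions chosen so that the iteration contracts.

```python
import random

# Hypothetical toy problem: K(x, y) = (1/n) * sum_i [x^2/2 + a_i*x*y - y^2/2],
# strongly convex in x, strongly concave in y, with saddle point (0, 0).
A = [0.9, 1.0, 1.1]


def operator(a, x, y):
    # Monotone operator (dK/dx, -dK/dy) of one component.
    return (x + a * y, y - a * x)


def full_operator(x, y):
    # Average of the component operators.
    n = len(A)
    gx = sum(operator(a, x, y)[0] for a in A) / n
    gy = sum(operator(a, x, y)[1] for a in A) / n
    return (gx, gy)


def svrg_saddle(x, y, step=0.1, epochs=30, inner=20, seed=0):
    rng = random.Random(seed)
    for _ in range(epochs):
        sx, sy = x, y                    # snapshot point
        fx, fy = full_operator(sx, sy)   # full operator at the snapshot
        for _ in range(inner):
            a = rng.choice(A)            # uniform sampling (an assumption)
            gx, gy = operator(a, x, y)
            hx, hy = operator(a, sx, sy)
            # Variance-reduced estimate, unbiased for the full operator at (x, y).
            vx, vy = gx - hx + fx, gy - hy + fy
            # Descent in x, ascent in y (the sign is folded into the operator).
            x, y = x - step * vx, y - step * vy
    return x, y


x, y = svrg_saddle(1.0, 1.0)
print(abs(x) < 1e-6 and abs(y) < 1e-6)
```

The correction term `g - h + f` shrinks as the iterate and the snapshot both approach the saddle point, which is why a constant step size can still give linear convergence on this toy problem.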

artificial intelligence, machine learning, survey article, (17 more...)

Genre: Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning Mar-23-2026

Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects

Wang, Hao, Pan, Licheng, Wen, Qingsong, Yu, Jialin, Chen, Zhichao, Zheng, Chunyuan, Li, Xiaoxi, Chu, Zhixuan, Xu, Chao, Gong, Mingming, Li, Haoxuan, Lu, Yuan, Lin, Zhouchen, Torr, Philip, Liu, Yan

Autocorrelation is a defining characteristic of time-series data, where each observation is statistically dependent on its predecessors. In the context of deep time-series forecasting, autocorrelation arises in both the input history and the label sequences, presenting two central research challenges: (1) designing neural architectures that model autocorrelation in history sequences, and (2) devising learning objectives that model autocorrelation in label sequences. Recent studies have made strides in tackling these challenges, but a systematic survey examining both aspects remains lacking. To bridge this gap, this paper provides a comprehensive review of deep time-series forecasting from the perspective of autocorrelation modeling. In contrast to existing surveys, this work makes two distinctive contributions. First, it proposes a novel taxonomy that encompasses recent literature on both model architectures and learning objectives -- whereas prior surveys neglect or inadequately discuss the latter aspect. Second, it offers a thorough analysis of the motivations, insights, and progression of the surveyed literature from a unified, autocorrelation-centric perspective, providing a holistic overview of the evolution of deep time-series forecasting. The full list of papers and resources is available at https://github.com/Master-PLC/Awesome-TSF-Papers.
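
As a reminder of the quantity the survey is organized around, here is a minimal sketch of the standard sample autocorrelation at a given lag. The function name is hypothetical, and the survey itself concerns how neural architectures and learning objectives model this dependence rather than this estimator.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return num / denom


alternating = [1.0, -1.0, 1.0, -1.0]
print([autocorrelation(alternating, k) for k in range(3)])  # [1.0, -0.75, 0.5]
```

Values far from zero at non-zero lags indicate that an observation carries information about later ones, which is the dependence present in both the input history and the label sequence.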

forecasting, large language model, machine learning, (18 more...)

2603.19899

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Neural Information Processing Systems Mar-21-2026, 18:45:13 GMT

A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs

As a popular paradigm for balancing data privacy and collaborative training, federated learning (FL) is widely used to process large-scale heterogeneous datasets in a distributed manner on edge clients. Due to bandwidth limitations and security considerations, it splits the original problem into multiple subproblems to be solved in parallel, which makes primal-dual methods highly applicable in FL. In this paper, we review the recent development of classical federated primal-dual methods and point out a serious common defect of such methods in non-convex scenarios, which we call "dual drift": it is caused by the dual hysteresis of clients that remain inactive for long periods under partial-participation training. To address this problem, we propose a novel Aligned Federated Primal Dual (A-FedPD) method, which constructs virtual dual updates to align the global consensus and the local dual variables of local clients that have not participated for a long time. We also provide a comprehensive analysis of the optimization and generalization efficiency of the A-FedPD method on smooth non-convex objectives, which confirms its high efficiency and practicality. Extensive experiments are conducted on several classical FL setups to validate the effectiveness of our proposed method.

artificial intelligence, proceedings, survey article, (4 more...)

Genre: Overview (0.60)

Industry: Information Technology > Security & Privacy (0.60)

Technology:

Information Technology > Security & Privacy (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Neural Information Processing Systems Mar-21-2026, 01:00:18 GMT

EGODE: An Event-attended Graph ODE Framework for Modeling Rigid Dynamics

This paper studies the problem of rigid dynamics modeling, which has a wide range of applications in robotics, graphics, and mechanical design. The problem is partly solved by graph neural network (GNN) simulators. However, these approaches cannot effectively handle the relationship between intrinsic continuity and instantaneous changes in rigid dynamics.

artificial intelligence, machine learning, proceedings, (10 more...)

Genre:

Research Report (0.40)
Overview (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Neural Information Processing Systems Mar-20-2026, 21:19:28 GMT

The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track

Data curation is a field with origins in librarianship and archives, whose scholarship and thinking on data issues go back centuries, if not millennia. The field of machine learning is increasingly observing the importance of data curation to the advancement of both applications and fundamental understanding of machine learning models -- evidenced not least by the creation of the Datasets and Benchmarks track itself. This work provides an analysis of recent dataset development practices at NeurIPS through the lens of data curation. We present an evaluation framework for dataset documentation, consisting of a rubric and toolkit developed through a thorough literature review of data curation principles. We use the framework to systematically assess the strengths and weaknesses in current dataset development practices of 60 datasets published in the NeurIPS Datasets and Benchmarks track from 2021-2023.

artificial intelligence, machine learning, proceedings, (8 more...)

Genre: Overview (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)