AITopics

Minimum Spanning Trees have been used in unsupervised learning, particularly in clustering tasks, due to their ability to recognize clusters by removing edges that are considered inconsistent in defining those clusters. This paper aims to study the use of Minimum Spanning Trees in supervised learning. Specifically, we propose a classification algorithm based on Minimum Spanning Trees. To improve its performance, we introduce a robust version of the method that is also computationally more efficient. We evaluate the effectiveness of our proposed method through an extensive simulation study. We also apply the proposed methodology to a real-world case study involving aircraft trajectories.
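The classical clustering use of Minimum Spanning Trees mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the classification algorithm the paper proposes: it assumes Euclidean distances and treats the `k - 1` heaviest tree edges as the "inconsistent" ones, which is only one common choice. The names `mst_edges` and `mst_clusters` are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_clusters(points, k):
    """Assumption: the k-1 heaviest MST edges are the 'inconsistent' ones to cut."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    root = list(range(len(points)))  # union-find parent array

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
labels = mst_clusters(points, k=2)
```

In this toy input the single long edge joining the two groups is the one removed, so the first three points share one label and the last two share another.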

artificial intelligence, machine learning, mst-class, (15 more...)

2606.21639

Country:

Europe > Austria (0.28)
Europe > Spain > Galicia (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Santos-Pascual, M., Insua, D. Ríos

Adversarial observations in probabilistic State-Space Models for robust Reinforcement Learning

Machine learning (ML) systems increasingly support decision-making in high-stakes settings such as robotics, autonomous systems, finance, homeland security, and critical infrastructure protection. In these domains, robustness and reliability are essential because failures can translate into physical harm, financial loss, or operational breakdown (García and Fernández, 2015). A recurring weakness is that many ML pipelines implicitly assume that training and deployment data are independent and identically distributed (i.i.d.), even though real deployments often violate this assumption through sensor drift, changing environments, and distribution shift (Quiñonero-Candela et al., 2009). In security-relevant contexts, this problem is amplified because adversaries can deliberately manipulate observations, rewards, or the environment to induce targeted shifts and drive the system toward failure (Barreno et al., 2006; Biggio and Roli, 2018; Vassilev et al., 2024). These concerns motivate the relatively recent field of adversarial machine learning (AML), which studies how malicious perturbations can break learning systems and how to design defenses against them (Biggio and Roli, 2018; Goodfellow, Shlens and Szegedy, 2015).

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2606.2088

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.93)
Government (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
(2 more...)

Vayness, Eyal, Sangnier, Maxime

Subsampling for supervised learning in reproducing kernel Hilbert spaces

In the era of big data, subsampling has become a common practice in statistical learning. By selecting a subgroup of individuals on which the learner is trained, subsampling aims to reduce the computational cost and time of the estimation step, and ideally leads to a decrease in its energy consumption and carbon footprint. This work focuses on a nonparametric setting, in which the hypothesis set lies in a reproducing kernel Hilbert space, and the estimator is a minimizer of an empirical risk reweighted à la Horvitz-Thompson. By studying the asymptotic properties of this estimator, we reveal an optimal subsampling scheme (regarding the trace of the covariance operator) and show that it can be used via plug-in. A numerical study on synthetic and real-world datasets shows the practicability and the benefit of the proposed approach.
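The Horvitz-Thompson reweighting mentioned in this abstract can be illustrated with a small sketch. It assumes independent (Poisson) sampling with known inclusion probabilities, which the abstract does not specify, and it only evaluates the reweighted risk for fixed per-example losses rather than minimizing it over a reproducing kernel Hilbert space. The function names are hypothetical.

```python
from itertools import product

def ht_weighted_risk(losses, incl_probs, sampled):
    """Horvitz-Thompson estimate of the full-sample average loss.

    Each sampled loss is divided by its inclusion probability, so rarely
    sampled points count for more.
    """
    n = len(losses)
    return sum(l / p for l, p, s in zip(losses, incl_probs, sampled) if s) / n

def expected_ht_risk(losses, incl_probs):
    """Exact expectation under independent sampling, by enumerating all subsets."""
    total = 0.0
    for sampled in product([0, 1], repeat=len(losses)):
        prob = 1.0
        for p, s in zip(incl_probs, sampled):
            prob *= p if s else (1.0 - p)
        total += prob * ht_weighted_risk(losses, incl_probs, sampled)
    return total

losses = [0.5, 2.0, 1.0, 4.0]        # illustrative per-example losses
incl_probs = [0.2, 0.9, 0.5, 0.7]    # illustrative inclusion probabilities
full_risk = sum(losses) / len(losses)
```

Under these assumptions `expected_ht_risk(losses, incl_probs)` agrees with `full_risk` up to floating-point error, which is the unbiasedness property that motivates the reweighting.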

artificial intelligence, def, machine learning, (20 more...)

2606.2126

Country: Europe (0.45)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

Agapiou, Sergios, Castillo, Ismaël, Egels, Paul

Leveraging tails for adaptation

A central goal in nonparametric statistics is adaptation: the ability of an estimator to perform simultaneously and optimally across a wide variety of settings with little to no tuning. When inference is carried out over a class of functional spaces, it is desirable that the estimator automatically adapts to unknown features of these spaces, such as smoothness, geometry, sparsity or other finer structural properties. A large body of literature has focused on adaptation: Lepski's method Lepskiĭ [1990, 1991], thresholding Donoho et al. [1995] and model selection Barron et al. [1999] are amongst the most well-known non-Bayesian approaches. Bayesian methods, on the other hand, have a natural ability to achieve adaptation, as we discuss in more detail below, by choosing prior distributions that are flexible enough to achieve this task (one possibility is for instance to draw certain prior parameters at random in a hierarchical Bayes fashion). Recently, motivated by the remarkable empirical success of deep learning methods, there has been a growing interest in understanding how neural networks can automatically learn structural parameters, such as smoothness of functions or 'effective' dimensions, for instance in regression settings exhibiting a compositional structure as in Schmidt-Hieber [2020], Kohler and Langer [2021] or for data lying on geometric structures (e.g.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2606.2048

Country: Europe (0.67)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Berthier, Raphaël, Pillaud-Vivien, Loucas

Incremental Learning in Mirror Flows

Neural networks trained with gradient descent often learn solutions of increasing complexity: the model first captures simple structure, then progressively incorporates finer details [AJB+17, KKN+19, ZSL25]. This incremental learning phenomenon, often visible as plateaus in the training loss separated by rapid transitions, has attracted significant theoretical attention. The most detailed analyses of incremental learning have been carried out for diagonal linear networks, including precise descriptions of transition times and plateau levels [Ber23, PF23]. This level of detail is possible because the training dynamics of these networks reduce to a mirror flow [WGL+20]. Mirror flows themselves feature incremental learning when initialized near the boundary of the domain of the mirror potential. This paper gives a rigorous description of this phenomenon for a broad class of mirror flows, thereby generalizing the previous analyses of diagonal linear networks.

artificial intelligence, machine learning, mirror flow, (14 more...)

2606.23198

Country: Europe > France (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Krahn, Maximilian, Bastian, Lennart, Garg, Vikas, Schuller, Björn, Birdal, Tolga

Collapsed Effective Operators for Higher-order Structures

Higher-order structures are powerful relational modeling tools, yet existing spectral operators decompose the topology into separate ranks, leaving practitioners to fuse the information back to vertices through ad hoc choices. We introduce Collapsed Effective Operators, which condense higher-order degrees of freedom into a single vertex-level operator via Schur complementation of a graded Laplacian. This yields a (generally dense) operator that encodes long-range interactions mediated by topology and is applicable to arbitrary higher-order constructs. We show it preserves positive semi-definiteness with a spectral upper bound relative to the rank-0 Hodge Laplacian, effectively lowering system energy under higher-order connectivity. Empirically, our operator improves spectral clustering and signal smoothing, and enables the inclusion of topological features in neural network architectures via positional encoding. The project page can be found at http://circle-group.github.io/research/CollapsedEffectiveOperators

artificial intelligence, laplacian, machine learning, (16 more...)

2606.23517

Country:

Europe (0.93)
North America > United States (0.28)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Barone, Rosario, Valle, Luciana Dalla, Leisen, Fabrizio, Villa, Cristiano

Bayesian model selection of vine copulas: a loss-based perspective

The growing popularity of vine copulas in multivariate statistical analysis is largely driven by their ability to capture complex dependence structures. However, this flexibility comes at a cost, as the number of possible vine models grows rapidly and becomes intractable even in moderately low-dimensional settings. These limitations affect the practical applicability of current Bayesian inference and model selection approaches, effectively restricting them to problems of relatively small dimension due to their high computational cost. This paper addresses the still open challenge of efficient model selection and estimation in Bayesian vine methodology. We propose a novel framework for Bayesian vine copula model selection that combines loss-based model priors with the shotgun stochastic search strategy. The strength of the proposed approach is twofold: it promotes sparsity and enables fast and effective structure selection. Furthermore, our comprehensive framework jointly identifies the vine structure, selects the copula families, and estimates the model parameters. The power of the proposed approach is demonstrated via simulation studies and an application to a real dataset of EFT portfolio asset returns.

artificial intelligence, copula, machine learning, (18 more...)

2606.21512

Country:

North America (0.28)
Europe > Italy (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

Ding, Yuanhao, Li, Meimingwei, Arias, Esteban Garces, Aßenmacher, Matthias, Heumann, Christian, Zhang, Chongsheng

Breaking the Likelihood Trap: Variance-Calibrated Modulation for Large Language Model Decoding

In open-ended generation, LLMs frequently fall into the "likelihood trap", marked by repetitive degeneration and vocabulary dullness, creating a discrepancy between machine-generated and human-written text. While post-hoc tail truncation (e.g., Top-$p$, Min-$p$) avoids sampling from the unreliable tail, it can over-sample from the uncalibrated head and misalign generation with human lexical preferences; fixed scalar repetition penalties likewise ignore variation in logit scale across inference steps, potentially disrupting semantic coherence. To address both limitations, we propose Variance-Calibrated Modulation (VCM), a training-free pre-decoding intervention that reshapes the probability distribution before truncation through two dynamic mechanisms: (1) Contextual Searchlight via PMI, which suppresses global stopwords while elevating context-evoked tokens, and (2) Adaptive Self-Debiasing, which uses real-time logit standard deviation for scale-invariant penalization. Across open-ended generation, factual QA, and mathematical reasoning, VCM consistently mitigates the likelihood trap. With negligible computational overhead, VCM integrates with existing decoding strategies, improving diversity, coherence, and, particularly at higher decoding temperatures, reasoning accuracy.

computational linguistic, large language model, natural language, (18 more...)

2606.22511

Country:

Asia > Middle East > UAE (0.46)
North America > United States (0.46)
Europe > Austria (0.28)
Europe > Germany (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Sports (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Kalogridis, Ioannis

Generalized nonparametric regression in reproducing kernel Hilbert spaces: Consistency and rates of convergence

We develop a comprehensive theory for regularized M-estimation in reproducing kernel Hilbert spaces. Under mild conditions on the loss we establish existence and measurability of the estimator, covering a wide range of convex and non-convex losses, including bounded robust losses. We further prove sharp rates of convergence with an explicit bias-variance decomposition governed by a novel complexity measure. We show that the variance is independent of misspecification, while the bias depends on a source condition parameter known in the learning literature. For tensor product Sobolev spaces we obtain new rates that connect to spaces of functions with dominating mixed smoothness, substantially extending existing results and explaining why these estimators circumvent the curse of dimensionality. Our methodology, combining elements from both functional analysis and empirical process theory, allows for an asymptotic linearisation of the objective function that avoids both closed-form solutions and global Lipschitz assumptions, and may be of independent interest. The estimators are implemented in C++ and the theory is supported by numerical experiments.

artificial intelligence, assumption, machine learning, (14 more...)

2606.22993

Country: Europe > Austria (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

Szabadváry, Johan Hallberg

Betting on Moments: Legendre Jumper Martingales for Online Exchangeability Testing

We present a family of conformal test martingales based on shifted Legendre polynomials, which extends the Simple Jumper martingale. The Simple Legendre Jumper substitutes the linear betting function with a polynomial of arbitrary degree, thereby facilitating the detection of variance, skewness, and higher-order deviations from uniformity; the standard Simple Jumper is a specific instance of degree one. The Product Legendre Jumper integrates multiple polynomial degrees into a unified betting function, although its state space expands exponentially, a cost we refer to as the jumping tax. To address this issue, we introduce the Variational Legendre Jumper, which factorises the joint adaptation through a mean-field approximation, thereby reducing exponential scaling to linear time with minimal loss in power. Lastly, the Composite Legendre Jumper incorporates several jumping rates, ensuring a wealth floor under exchangeability and automatic adaptation to the shift's timescale. Empirical results from a real-world classification task demonstrate that the combined methods consistently surpass any single-degree martingale under distributional shift, and the composite variant is recommended as the default when the shift timescale is unknown.
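A betting function built from a shifted Legendre polynomial can be sketched as below. This is an illustration under assumptions, not the paper's construction: the form `1 + eps * P(p)` with a fixed `eps` in [-1, 1] is assumed here, the jumping mechanism that switches between values of `eps` is omitted, and the names `shifted_legendre`, `bet`, and `wealth` are hypothetical.

```python
def shifted_legendre(degree, x):
    """Shifted Legendre polynomial on [0, 1], i.e. P_degree(2x - 1).

    Uses the three-term recurrence (n+1) P_{n+1}(t) = (2n+1) t P_n(t) - n P_{n-1}(t).
    """
    t = 2.0 * x - 1.0
    prev, curr = 1.0, t
    if degree == 0:
        return prev
    for n in range(1, degree):
        prev, curr = curr, ((2 * n + 1) * t * curr - n * prev) / (n + 1)
    return curr

def bet(p, degree, eps):
    """Assumed betting function: nonnegative for |eps| <= 1 and, for
    degree >= 1, integrates to 1 over [0, 1]."""
    return 1.0 + eps * shifted_legendre(degree, p)

def wealth(p_values, degree, eps):
    """Running product of bets on a sequence of conformal p-values."""
    w = 1.0
    for p in p_values:
        w *= bet(p, degree, eps)
    return w
```

For degree one the polynomial is `2p - 1`, so the bet is linear in the p-value; higher degrees respond to other departures from uniformity. For example, `wealth([0.9, 0.95, 0.85], 1, 0.5)` grows above 1 because the p-values cluster near 1 instead of looking uniform.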

artificial intelligence, legendre jumper, machine learning, (19 more...)

2606.20859

Country: Europe (0.28)

Genre: Research Report (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)