AITopics

In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms, and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature and two entropy families, the Rényi and Tsallis entropy. The Rényi entropy is connected to the Huggins-Roy's ESS family introduced in \cite{Huggins15}. We prove that that all the ESS functions included in the Huggins-Roy's family fulfill all the desirable theoretical conditions. We analyzed and remark the connections with several other fields, such as the Hill numbers introduced in ecology, the Gini inequality coefficient employed in economics, and the Gini impurity index used mainly in machine learning, to name a few. Finally, by numerical simulations, we study the performance of different ESS expressions contained in the previous ESS families in terms of approximation of the theoretical ESS definition, and show the application of ESS formulas in a variable selection problem.

artificial intelligence, ess-h, machine learning, (18 more...)

doi: 10.1007/s00180-025-01665-8

2602.22954

Country:

Europe > Italy (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Sharp Convergence Rates for Masked Diffusion Models

Liang, Yuchen, Tan, Zhiheng, Shroff, Ness, Liang, Yingbin

Discrete diffusion models have achieved strong empirical performance in text and other symbolic domains, with masked (absorbing-rate) variants emerging as competitive alternatives to autoregressive models. Among existing samplers, the Euler method remains the standard choice in many applications, and more recently, the First-Hitting Sampler (FHS) has shown considerable promise for masked diffusion models. Despite their practical success, the theoretical understanding of these samplers remains limited. Existing analyses are conducted in Kullback-Leibler (KL) divergence, which often yields loose parameter dependencies and requires strong assumptions on score estimation. Moreover, these guarantees do not cover recently developed high-performance sampler of FHS. In this work, we first develop a direct total-variation (TV) based analysis for the Euler method that overcomes these limitations. Our results relax assumptions on score estimation, improve parameter dependencies, and establish convergence guarantees without requiring any surrogate initialization. Also for this setting, we provide the first convergence lower bound for the Euler sampler, establishing tightness with respect to both the data dimension $d$ and the target accuracy $\varepsilon$. Finally, we analyze the FHS sampler and show that it incurs no sampling error beyond that induced by score estimation, which we show to be tight with a matching lower error bound. Overall, our analysis introduces a direct TV-based error decomposition along the CTMC trajectory and a decoupling-based path-wise analysis for FHS, which may be of independent interest.

artificial intelligence, diffusion model, machine learning, (17 more...)

2602.22505

Country:

North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A 1/R Law for Kurtosis Contrast in Balanced Mixtures

Bi, Yuda, Xiao, Wenjun, Bai, Linhao, Calhoun, Vince D

Abstract--Kurtosis-based Independent Component Analysis (ICA) weakens in wide, balanced mixtures. We also show that purification--selecting m R sign-consistent sources--restores R-independent contrast Ω(1/m), with a simple data-driven heuristic. Synthetic experiments validate the predicted decay, the T crossover, and contrast recovery. Independent Component Analysis (ICA) recovers statistically independent latent sources from linear mixtures and is identifiable whenever at most one source is Gaussian [1]. Excess kurtosis--the standardized fourth cumulant--is a central contrast function [9], and kurtosis-type nonlinearities remain standard in FastICA.

artificial intelligence, independent component analysis, machine learning, (15 more...)

2602.22334

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Health Care Technology (0.70)
Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Wang, Ruixiao, Chen, Xiaohong, Chewi, Sinho

Sampling from Constrained Gibbs Measures: with Applications to High-Dimensional Bayesian Inference

This paper considers a non-standard problem of generating samples from a low-temperature Gibbs distribution with \emph{constrained} support, when some of the coordinates of the mode lie on the boundary. These coordinates are referred to as the non-regular part of the model. We show that in a ``pre-asymptotic'' regime in which the limiting Laplace approximation is not yet valid, the low-temperature Gibbs distribution concentrates on a neighborhood of its mode. Within this region, the distribution is a bounded perturbation of a product measure: a strongly log-concave distribution in the regular part and a one-dimensional exponential-type distribution in each coordinate of the non-regular part. Leveraging this structure, we provide a non-asymptotic sampling guarantee by analyzing the spectral gap of Langevin dynamics. Key examples of low-temperature Gibbs distributions include Bayesian posteriors, and we demonstrate our results on three canonical examples: a high-dimensional logistic regression model, a Poisson linear model, and a Gaussian mixture model.

artificial intelligence, assumption 4, machine learning, (16 more...)

2602.22369

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey > Bergen County > Hackensack (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Differentially Private Truncation of Unbounded Data via Public Second Moments

Cao, Zilong, Bi, Xuan, Zhang, Hai

Data privacy is important in the AI era, and differential privacy (DP) is one of the golden solutions. However, DP is typically applicable only if data have a bounded underlying distribution. We address this limitation by leveraging second-moment information from a small amount of public data. We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix and applies a principled truncation whose radius depends only on non-private quantities: data dimension and sample size. This transformation yields a well-conditioned second-moment matrix, enabling its inversion with a significantly strengthened ability to resist the DP noise. Furthermore, we demonstrate the applicability of PMT by using penalized and generalized linear regressions. Specifically, we design new loss functions and algorithms, ensuring that solutions in the transformed space can be mapped back to the original domain. We have established improvements in the models' DP estimation through theoretical error bounds, robustness guarantees, and convergence results, attributing the gains to the conditioning effect of PMT. Experiments on synthetic and real datasets confirm that PMT substantially improves the accuracy and stability of DP models.

artificial intelligence, machine learning, matrix, (17 more...)

2602.22282

Country:

North America > United States > Minnesota (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Learning Credal Ensembles via Distributionally Robust Optimization

Wang, Kaizheng, Faza, Ghifari Adam, Cuzzolin, Fabio, Chau, Siu Lun, Moens, David, Hallez, Hans

Credal predictors are models that are aware of epistemic uncertainty and produce a convex set of probabilistic predictions. They offer a principled way to quantify predictive epistemic uncertainty (EU) and have been shown to improve model robustness in various settings. However, most state-of-the-art methods mainly define EU as disagreement caused by random training initializations, which mostly reflects sensitivity to optimization randomness rather than uncertainty from deeper sources. To address this, we define EU as disagreement among models trained with varying relaxations of the i.i.d. assumption between training and test data. Based on this idea, we propose CreDRO, which learns an ensemble of plausible models through distributionally robust optimization. As a result, CreDRO captures EU not only from training randomness but also from meaningful disagreement due to potential distribution shifts between training and test data. Empirical results show that CreDRO consistently outperforms existing credal methods on tasks such as out-of-distribution detection across multiple benchmarks and selective classification in medical applications.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2602.0847

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Netherlands (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing (0.93)
Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
(2 more...)

Santos, Vagner, Coscrato, Victor, Cabezas, Luben, Izbicki, Rafael, Ramos, Thiago

LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees

Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split conformal uses a single global residual quantile and can be poorly adaptive under heteroscedasticity. Methods that improve adaptivity typically fit auxiliary nuisance models or introduce additional data splits/partitions to learn the conformal score, increasing cost and reducing data efficiency. We propose LoBoost, a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. Each input is encoded by its sequence of visited leaves; at resolution level k, we group points by matching prefixes of leaf indices across the first k trees and calibrate residual quantiles within each group. LoBoost requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split. Experiments show competitive interval quality, improved test MSE on most datasets, and large calibration speedups.

artificial intelligence, machine learning, prediction, (19 more...)

2602.22432

Country:

South America > Brazil > São Paulo (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Carpentier, Alexandra, Verzelen, Nicolas

Low-degree Lower bounds for clustering in moderate dimension

We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $Δ$ between mean vectors to partially recover the underlying partition. While the minimax-optimal threshold for $Δ$ is well-established, a significant gap exists between this information-theoretic limit and the performance of known polynomial-time procedures. Although this gap was recently characterized in the high-dimensional regime ($n \leq dK$), it remains largely unexplored in the moderate-dimensional regime ($n \geq dK$). In this manuscript, we address this regime by establishing a new low-degree polynomial lower bound for the moderate-dimensional case when $d \geq K$. We show that while the difficulty of clustering for $n \leq dK$ is primarily driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-parametric rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.

artificial intelligence, machine learning, node, (17 more...)

2602.23023

Country:

North America > United States (0.14)
Europe > France > Occitanie > Hérault > Montpellier (0.04)
Europe > Italy > Lazio > Rome (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.65)

Jawaid, Arsalan, Karatas, Abdullah, Seewig, Jörg

Regular Fourier Features for Nonstationary Gaussian Processes

Simulating a Gaussian process requires sampling from a high-dimensional Gaussian distribution, which scales cubically with the number of sample locations. Spectral methods address this challenge by exploiting the Fourier representation, treating the spectral density as a probability distribution for Monte Carlo approximation. Although this probabilistic interpretation works for stationary processes, it is overly restrictive for the nonstationary case, where spectral densities are generally not probability measures. We propose regular Fourier features for harmonizable processes that avoid this limitation. Our method discretizes the spectral representation directly, preserving the correlation structure among spectral weights without requiring probability assumptions. Under a finite spectral support assumption, this yields an efficient low-rank approximation that is positive semi-definite by construction. When the spectral density is unknown, the framework extends naturally to kernel learning from data. We demonstrate the method on locally stationary kernels and on harmonizable mixture kernels with complex-valued spectral densities.

artificial intelligence, machine learning, spectral density, (19 more...)

2602.23006

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
North America > United States (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Daily Mail - Science & techFeb-26-2026, 22:39:05 GMT

Revealed: Unexplained objects that stop and accelerate quickly in space detected by 'highly qualified observers, says former UFO chief. 'Spacecraft we know don't behave that way'

Kentucky mother and daughter turn down $26.5MILLION to sell their farms to secretive tech giant that wants to build data center there Horrifying next twist in the Alexander brothers case: MAUREEN CALLAHAN exposes an unthinkable perversion that's been hiding in plain sight Hollywood icon who starred in Psycho after Hitchcock dubbed her'my new Grace Kelly' looks incredible at 95 Kylie Jenner's total humiliation in Hollywood: Derogatory rumor leaves her boyfriend's peers'laughing at her' behind her back Tucker Carlson erupts at Trump adviser as she hurls'SLANDER' claim linking him to synagogue shooting Ben Affleck'scores $600m deal' with Netflix to sell his AI film start-up Long hair over 45 is ageing and try-hard. I've finally cut mine off. Alexander brothers' alleged HIGH SCHOOL rape video: Classmates speak out on sickening footage... as creepy unseen photos are exposed Heartbreaking video shows very elderly DoorDash driver shuffle down customer's driveway with coffee order because he is too poor to retire Amber Valletta, 52, was a '90s Vogue model who made movies with Sandra Bullock and Kate Hudson, see her now Model Cindy Crawford, 60, mocked for her'out of touch' morning routine: 'Nothing about this is normal' Revealed: Unexplained objects that stop and accelerate quickly in space detected by'highly qualified observers, says former UFO chief. 'Spacecraft we know don't behave that way' Sign up for our US Editor's Picks newsletter to get all the best exclusive stories The Pentagon's UFO office former chief has revealed unexplained objects were detected in space - and that some performed maneuvers defying anything in America's known aerospace arsenal. Lieutenant Colonel Tim Phillips, who was acting director of the All-domain Anomaly Resolution Office (AARO) until last April, told the Daily Mail that while most cases involved objects in the air, some detections extended beyond the atmosphere.

artificial intelligence, phillips, social media, (14 more...)

Daily Mail - Science & tech

Country:

North America > United States > Kentucky (0.24)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Valletta (0.24)
North America > Canada > Alberta (0.14)
(16 more...)

Genre: Personal > Obituary (0.46)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(7 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.68)