

Large-batch Optimization for Dense Visual Predictions

Neural Information Processing Systems

At the t-th backward propagation step, we can derive the gradient ∇ℓ_i(w_t) to update the i-th module in M. The number in the bracket represents the batch size. We see that when the batch size is small (i.e., 32), the gradient variances are similar. N and K indicate the number of FPN levels and region proposals fed into the detection head. To evaluate this assumption, as shown in Figure 1, we make three observations. As illustrated by the second panel of Figure 1, the gradient misalignment between the detection head and the backbone has been reduced.
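The batch-size effect described above can be illustrated with a minimal simulation (all numbers and the synthetic per-sample gradients below are hypothetical, not from the paper): averaging per-sample gradients over a larger mini-batch shrinks the variance of the gradient estimate roughly like 1/batch_size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample gradients for one module: mean 1, unit noise.
n_samples, dim = 10_000, 8
per_sample = rng.normal(loc=1.0, scale=1.0, size=(n_samples, dim))

def batch_grad_variance(batch_size, n_batches=2000):
    """Empirical variance of the mini-batch gradient estimate."""
    grads = []
    for _ in range(n_batches):
        idx = rng.integers(0, n_samples, size=batch_size)
        grads.append(per_sample[idx].mean(axis=0))
    return np.var(np.stack(grads), axis=0).mean()

v32, v512 = batch_grad_variance(32), batch_grad_variance(512)
# Variance of the averaged gradient scales roughly as 1/batch_size,
# so the ratio below should be close to 512 / 32 = 16.
print(v32 / v512)
```

This is only a sketch of the variance-vs-batch-size relationship, not a reproduction of the paper's per-module measurements.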


On the Saturation Effects of Spectral Algorithms in Large Dimensions

Neural Information Processing Systems

Many non-parametric regression methods have been proposed to solve the regression problem by assuming that f falls into certain function classes, including polynomial splines (Stone, 1994), local polynomials (Cleveland, 1979; Stone, 1977), the spectral algorithms (Caponnetto, 2006; Caponnetto and De Vito, 2007; Caponnetto and Yao, 2010), etc.



A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs

Neural Information Processing Systems

We present a computationally efficient algorithm for learning in this framework that simultaneously achieves near-optimal regret bounds in both stochastic and adversarial environments. The bound against oblivious adversaries is O(√(αT)), where T is the time horizon and α is the independence number of the feedback graph.
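The independence number α that governs this bound is the size of the largest set of mutually non-adjacent arms in the feedback graph. A minimal brute-force sketch (the 5-cycle graph below is a hypothetical example, not from the paper) makes the quantity concrete:

```python
from itertools import combinations

# Toy feedback graph on 5 arms: a 5-cycle (hypothetical example).
n = 5
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)}

def is_independent(S):
    """True if no two vertices in S share an edge."""
    return all((u, v) not in edges and (v, u) not in edges
               for u, v in combinations(S, 2))

def independence_number(n):
    """Size of the largest independent set (brute force over all subsets)."""
    return max(len(S) for r in range(n + 1)
               for S in combinations(range(n), r) if is_independent(S))

alpha = independence_number(n)
print(alpha)  # a 5-cycle has independence number 2
```

With α = 2 here, the adversarial regret bound would scale like √(2T) rather than the √(5T) of a graph with no feedback edges.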



min

Neural Information Processing Systems

Let A be an n×n Hermitian matrix and let B be an (n−1)×(n−1) matrix constructed by deleting the i-th row and i-th column of A. Denote Φ = [ϕ(x1), ..., ϕ(xn)]⊤ ∈ R^{n×D}, where D is the dimension of the feature space H. Performing rank-n singular value decomposition (SVD) on Φ, we have Φ = HΣV⊤, where H ∈ R^{n×n}, Σ ∈ R^{n×n} is a diagonal matrix whose diagonal elements are the singular values of Φ, and V ∈ R^{D×n}. F(α) in Eq. (21) is proven differentiable, and the p-th component of its gradient is ∂F(α)/∂α_p = … Then, a reduced gradient descent algorithm [26] is adopted to optimize Eq. (21). The three deep neural networks are pre-trained on ImageNet [5].
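The rank-n SVD and its stated shapes can be checked with a small numpy sketch (the sizes n = 4, D = 6 and the random Φ are placeholders for the feature matrix described above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, D = 4, 6                      # n samples in a D-dimensional feature space
Phi = rng.normal(size=(n, D))    # stand-in for [phi(x1), ..., phi(xn)]^T

# Thin (rank-n) SVD: Phi = H @ Sigma @ V.T with the shapes from the text.
H, s, Vt = np.linalg.svd(Phi, full_matrices=False)
Sigma, V = np.diag(s), Vt.T

print(H.shape, Sigma.shape, V.shape)      # (4, 4) (4, 4) (6, 4)
print(np.allclose(Phi, H @ Sigma @ V.T))  # exact reconstruction
```

Note that `full_matrices=False` is what yields the n×n left factor and D×n right factor used in the text, rather than the full D×D right singular matrix.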



ff4d5fbbafdf976cfdc032e3bde78de5-Supplemental.pdf

Neural Information Processing Systems

As such, we see that this variance depends on the structure of the density ρ_X, through the variance of (I + λL)^{−1} δ_X, and on the labelling noise, through the variance of (Y | X).


An Injective Change-of-Variable Formula and Stacking Injective Flows

We first derive (5) from (3). By the chain rule, we have: J[g_φ] …

Neural Information Processing Systems

We summarize our methods for computing/estimating the gradient of the log determinant arising in maximum likelihood training of rectangular flows. Algorithm 2 shows the exact method, where jvp(f, z, v) denotes computing J[f](z)v using forward-mode AD, and e_i ∈ R^d is the i-th standard basis vector, i.e. a one-hot vector with a 1 on its i-th coordinate. Note that ∂/∂θ log det A_θ is computed using backpropagation. The for loop is easily parallelized in practice.
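The exact method can be sketched as follows: build the Jacobian column by column with one jvp per basis vector e_i, then take ½ log det(JᵀJ), the rectangular-flow log-determinant term. The toy map f and the central-difference stand-in for a forward-mode jvp below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def f(z):
    """Toy injective map R^2 -> R^3 (hypothetical, for illustration)."""
    return np.array([z[0], z[1], z[0] ** 2 + z[1] ** 2])

def jvp(f, z, v, eps=1e-6):
    """Central-difference stand-in for forward-mode AD: approximates J[f](z) @ v."""
    return (f(z + eps * v) - f(z - eps * v)) / (2 * eps)

def log_det_jtj(f, z, d):
    # The i-th column of J is jvp(f, z, e_i) -- one pass per basis vector.
    J = np.stack([jvp(f, z, np.eye(d)[i]) for i in range(d)], axis=1)
    # Rectangular flows use (1/2) log det(J^T J) in the likelihood.
    sign, logdet = np.linalg.slogdet(J.T @ J)
    return 0.5 * logdet

z = np.array([1.0, 2.0])
# Here J^T J = I + 4 z z^T, so det(J^T J) = 1 + 4||z||^2 = 21.
print(log_det_jtj(f, z, 2), 0.5 * np.log(21.0))
```

In an AD framework the finite-difference `jvp` would be replaced by a true forward-mode Jacobian-vector product, and the d calls in the loop can run in parallel, as the text notes.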