AITopics | Asia

Asia

Momentum Further Constrains Sharpness at the Edge of Stochastic Stability

Andreyev, Arseniy, Ananthkumar, Advikar, Walden, Marc, Poggio, Tomaso, Beneventano, Pierfrancesco

arXiv.org Machine Learning | Apr-16-2026

Recent work suggests that (stochastic) gradient descent self-organizes near an instability boundary, shaping both optimization and the solutions found. Momentum and mini-batch gradients are widely used in practical deep learning optimization, but it remains unclear whether they operate in a comparable regime of instability. We demonstrate that SGD with momentum exhibits an Edge of Stochastic Stability (EoSS)-like regime with batch-size-dependent behavior that cannot be explained by a single momentum-adjusted stability threshold. Batch Sharpness (the expected directional mini-batch curvature) stabilizes in two distinct regimes: at small batch sizes it converges to a lower plateau $2(1-β)/η$, reflecting amplification of stochastic fluctuations by momentum and favoring flatter regions than vanilla SGD; at large batch sizes it converges to a higher plateau $2(1+β)/η$, where momentum recovers its classical stabilizing effect and favors sharper regions consistent with full-batch dynamics. We further show that this aligns with linear stability thresholds and discuss the implications for hyperparameter tuning and coupling.

artificial intelligence, machine learning, regime, (17 more...)

arXiv.org Machine Learning

2604.14108

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Identifiability of Potentially Degenerate Gaussian Mixture Models With Piecewise Affine Mixing

Xu, Danru, Lachapelle, Sébastien, Magliacane, Sara

arXiv.org Machine Learning | Apr-16-2026

Causal representation learning (CRL) aims to identify the underlying latent variables from high-dimensional observations, even when the variables are dependent on each other. We study this problem for latent variables that follow a potentially degenerate Gaussian mixture distribution and that are only observed through the transformation via a piecewise affine mixing function. We provide a series of progressively stronger identifiability results for this challenging setting in which the probability density functions are ill-defined because of the potential degeneracy. For identifiability up to permutation and scaling, we leverage a sparsity regularization on the learned representation. Based on our theoretical results, we propose a two-stage method to estimate the latent variables by enforcing sparsity and Gaussianity in the learned representations. Experiments on synthetic and image data highlight our method's effectiveness in recovering the ground-truth latent variables.

artificial intelligence, identifiability, machine learning, (15 more...)

arXiv.org Machine Learning

2604.13218

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

A short proof of near-linear convergence of adaptive gradient descent under fourth-order growth and convexity

Davis, Damek, Drusvyatskiy, Dmitriy

arXiv.org Machine Learning | Apr-16-2026

Davis, Drusvyatskiy, and Jiang showed that gradient descent with an adaptive stepsize converges locally at a nearly linear rate for smooth functions that grow at least quartically away from their minimizers. The argument is intricate, relying on monitoring the performance of the algorithm relative to a certain manifold of slow growth -- called the ravine. In this work, we provide a direct Lyapunov-based argument that bypasses these difficulties when the objective is in addition convex and has a unique minimizer. As a byproduct of the argument, we obtain a more adaptive variant than the original algorithm with encouraging numerical performance.

artificial intelligence, gradient descent, machine learning, (16 more...)

arXiv.org Machine Learning

2604.13393

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)

Grayson Perry Has Seen the Future review – some of these insights into AI are just mindblowing

The Guardian | Apr-15-2026, 21:00:38 GMT

Intelligent, egoless: the artist in Grayson Perry Has Seen the Future. From people marrying digital companions to CEOs excited about how people whose jobs are replaced can 'adapt', this is terrifying watching. There is a fun game you can play while watching Grayson Perry Has Seen the Future, the two-part documentary presented by the artist on the subject of artificial intelligence, its uses and its possible ramifications. Gather a group of friends, press play, and see which of you loses your mind first.

artificial intelligence, grayson perry, social media, (8 more...)

The Guardian

Country:

Asia > Southeast Asia (0.06)
Oceania > Australia (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
Europe > Ukraine (0.05)

Industry: Leisure & Entertainment > Sports (0.72)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.74)

Visit a WWII destroyer without leaving your sofa

The USS Cassin Young is one of the last of the war's Fletcher-class destroyers. The USS Cassin Young is one of four remaining Fletcher-class destroyers still afloat. Although its name may not sound immediately familiar, the over 360-foot-long ship's recognizable silhouette remains a hallmark example of World War II imagery.

artificial intelligence, physics popular science video space, uss cassin young, (8 more...)

Popular Science

Country:

Oceania > Palau (0.05)
Oceania > Guam (0.05)
North America > United States > Massachusetts (0.05)
(4 more...)

Industry:

Government > Military > Navy (1.00)
Government > Regional Government > North America Government > United States Government (0.30)

Technology: Information Technology > Artificial Intelligence (0.71)

Monkeys walk around a virtual world using only their thoughts

New Scientist | Apr-15-2026, 18:00:42 GMT

Researchers hope the experiments will pave the way for people with paralysis to explore virtual worlds or more intuitively control electric wheelchairs in this one. Peter Janssen at KU Leuven in Belgium and colleagues implanted three rhesus macaque (Macaca mulatta) monkeys with BCIs. Crucially, each animal got three implants, each consisting of 96 electrodes, positioned in the primary motor, dorsal and ventral premotor cortex. The first area is commonly used in BCI research and relates to physical movement, but the latter two are thought to be involved in planning movement in a higher, more abstract way. Electrical signals from the implants were then interpreted by an AI model and used to control VR avatars as the monkeys watched a 3D monitor.

artificial intelligence, close advertisement skip, social media, (16 more...)

New Scientist

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.25)
Europe > Ukraine > Kyiv Oblast > Chernobyl (0.06)
Asia > Middle East > Iran (0.05)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Human Computer Interaction > Interfaces (0.78)
Information Technology > Artificial Intelligence (0.72)

50,000 illegal shark fins found inside fake car part boxes

The poached ingredients worth $1.3 million were seized in a nationwide hunt. Shark fins remain a prized delicacy despite conservation efforts and education. The United States Fish and Wildlife Service (FWS) recently exposed a major international smuggling operation orchestrated across at least three cities around the country.

andrew paul, artificial intelligence, physics popular science video space, (8 more...)

Popular Science

Country:

North America > United States > Maryland (0.06)
North America > United States > Ohio > Hamilton County > Cincinnati (0.05)
North America > United States > Kentucky > Jefferson County > Louisville (0.05)
(6 more...)

Industry:

Government > Regional Government > North America Government > United States Government (0.35)
Media > Photography (0.33)

Technology: Information Technology > Artificial Intelligence (0.36)

Discrete Flow Maps

Potaptchik, Peter, Yim, Jason, Saravanan, Adhi, Holderrieth, Peter, Vanden-Eijnden, Eric, Albergo, Michael S.

arXiv.org Machine Learning | Apr-15-2026

The sequential nature of autoregressive next-token prediction imposes a fundamental speed limit on large language models. While continuous flow models offer a path to parallel generation, they traditionally demand expensive iterative integration. Flow Maps bypass this bottleneck by compressing generative trajectories into single-step mappings, theoretically enabling the generation of full text sequences from noise in a single forward pass. However, standard formulations rely on Euclidean regression losses that are geometrically ill-suited for discrete data. In this work, we resolve this conflict with Discrete Flow Maps, a framework that reconciles trajectory compression with the geometry of the probability simplex. We recast standard flow map training for the discrete domain, aligning the training dynamics with the discrete nature of language. Empirically, this strict geometric alignment allows our method to surpass previous state-of-the-art results in discrete flow modeling.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Machine Learning

2604.09784

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Syria (0.04)
North America > United States > New York > Kings County > New York City (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

A Large-Scale Comparative Analysis of Imputation Methods for Single-Cell RNA Sequencing Data

Iwashita, Yuichiro, Abbasi, Ahtisham Fazeel, Kise, Koichi, Dengel, Andreas, Asim, Muhammad Nabeel

arXiv.org Machine Learning | Apr-15-2026

Background: Single-cell RNA sequencing (scRNA-seq) enables gene expression profiling at cellular resolution but is inherently affected by sparsity caused by dropout events, where expressed genes are recorded as zeros due to technical limitations. These artifacts distort gene expression distributions and compromise downstream analyses. Numerous imputation methods have been proposed to recover latent transcriptional signals. These methods range from traditional statistical models to deep learning (DL)-based methods. However, their comparative performance remains unclear, as existing benchmarks evaluate only a limited subset of methods, datasets, and downstream analyses. Results: We present a comprehensive benchmark of 15 scRNA-seq imputation methods spanning 7 methodological categories, including traditional and DL-based methods. Methods are evaluated across 30 datasets from 10 experimental protocols on 6 downstream analyses. Results show that traditional methods, such as model-based, smoothing-based, and low-rank matrix-based methods, generally outperform DL-based methods, including diffusion-based, GAN-based, GNN-based, and autoencoder-based methods. In addition, strong performance in numerical gene expression recovery does not necessarily translate into improved biological interpretability in downstream analyses, including cell clustering, differential expression analysis, marker gene analysis, trajectory analysis, and cell type annotation. Furthermore, method performance varies substantially across datasets, protocols, and downstream analyses, with no single method consistently outperforming others. Conclusions: Our findings provide practical guidance for selecting imputation methods tailored to specific analytical objectives and underscore the importance of task-specific evaluation when assessing imputation performance in scRNA-seq data analysis.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2603.24626

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
(4 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.67)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Offline-Online Reinforcement Learning for Linear Mixture MDPs

Zhang, Zhongjun, Sinclair, Sean R.

arXiv.org Machine Learning | Apr-15-2026

We study offline-online reinforcement learning in linear mixture Markov decision processes (MDPs) under environment shift. In the offline phase, data are collected by an unknown behavior policy and may come from a mismatched environment, while in the online phase the learner interacts with the target environment. We propose an algorithm that adaptively leverages offline data. When the offline data are informative, either due to sufficient coverage or small environment shift, the algorithm provably improves over purely online learning. When the offline data are uninformative, it safely ignores them and matches the online-only performance. We establish regret upper bounds that explicitly characterize when offline data are beneficial, together with nearly matching lower bounds. Numerical experiments further corroborate our theoretical findings.

machine learning, reinforcement learning, zhangandsinclair, (20 more...)

arXiv.org Machine Learning

2604.11994

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.49)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
(2 more...)
