AITopics | Genre

Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks

Neural Information Processing Systems | Jun-15-2026, 11:42:25 GMT

We study the dynamics of stochastic gradient descent (SGD) for a class of sequence models termed Sequence Single-Index (SSI) models, where the target depends on a single direction in input space applied to a sequence of tokens. This setting generalizes classical single-index models to the sequential domain, encompassing simplified one-layer attention architectures. We derive a closed-form expression for the population loss in terms of a pair of sufficient statistics capturing semantic and positional alignment, and characterize the induced high-dimensional SGD dynamics for these coordinates. Our analysis reveals two distinct training phases, escape from uninformative initialization and alignment with the target subspace, and demonstrates how the sequence length and positional encoding influence convergence speed and learning trajectories. These results provide a rigorous and interpretable foundation for understanding how sequential structure in data can be beneficial for learning with attention-based models.

Stochastic Gradient Descent (SGD) is the core optimization tool driving modern machine learning. Recent years have seen substantial progress in understanding its dynamics, particularly in two-layer networks [Saad and Solla, 1995, Mei et al., 2018, Chizat and Bach, 2018, Rotskoff and Vanden-Eijnden, 2022, Sirignano and Spiliopoulos, 2020, Arnaboldi et al., 2023a]. While global convergence is qualitatively well understood when the network is wide enough, quantitative results are scarcer. A particularly fruitful body of recent theoretical work addressing this gap has focused on deriving precise convergence rates for particular model classes on synthetic data, such as high-dimensional Gaussian single and multi-index models [Ben Arous et al., 2021, Abbe et al., 2022, 2023].

artificial intelligence, machine learning, sie, (17 more...)

Neural Information Processing Systems

Country:

Europe > France (0.46)
North America > United States (0.45)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.25)

Genre: Research Report > Experimental Study (1.00)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Is this the key to preventing a Super El Niño? Scientists want to dim the SUN to shield the oceans from heatwaves

Daily Mail - Science & tech | Jun-15-2026, 11:42:21 GMT

As scientists warn that the coming Super El Niño could be the worst in recorded history, one group of researchers has proposed a drastic solution.

artificial intelligence, jelly roll, social media, (16 more...)

Daily Mail - Science & tech

Country:

North America > United States (1.00)
Africa (1.00)
Asia (0.93)
Europe > United Kingdom > England (0.46)

Genre: Personal (0.68)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.88)

Training-free Detection of AI-generated images via Cropping Robustness

Neural Information Processing Systems | Jun-15-2026, 11:42:05 GMT

AI-generated image detection has become crucial with the rapid advancement of vision-generative models. Instead of training detectors tailored to specific datasets, we study a training-free approach leveraging self-supervised models without requiring prior data knowledge. These models, pre-trained with augmentations like RandomResizedCrop, learn to produce consistent representations across varying resolutions. Motivated by this, we propose WaRPAD, a training-free AI-generated image detection algorithm based on self-supervised models. Since neighborhood pixel differences in images are highly sensitive to resizing operations, WaRPAD first defines a base score function that quantifies the sensitivity of image embeddings to perturbations along high-frequency directions extracted via Haar wavelet decomposition.
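To make "high-frequency directions extracted via Haar wavelet decomposition" concrete, here is a minimal sketch of one level of the orthonormal 2D Haar transform in plain Python. It is not the WaRPAD algorithm or its score function; the function name `haar2d_level1` and the toy images are assumptions made for illustration only.

```python
def haar2d_level1(img):
    """One level of the orthonormal 2D Haar transform.

    `img` is a list of rows with even height and width. Returns four
    sub-bands: the local average (low frequency) and three detail
    bands (high frequency).
    """
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(img), 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for c in range(0, len(img[0]), 2):
            # The 2x2 block of neighboring pixels.
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_a.append((a + b + d + e) / 2)  # local average
            row_c.append((a - b + d - e) / 2)  # change across columns
            row_r.append((a + b - d - e) / 2)  # change across rows
            row_d.append((a - b - d + e) / 2)  # diagonal change
        approx.append(row_a)
        d_cols.append(row_c)
        d_rows.append(row_r)
        d_diag.append(row_d)
    return approx, d_cols, d_rows, d_diag


# Toy inputs (hypothetical): a flat patch and a checkerboard patch.
flat = haar2d_level1([[5, 5], [5, 5]])
checker = haar2d_level1([[1, 0], [0, 1]])
```

In this toy case the flat patch has all of its energy in the average band, while the checkerboard also puts energy in the diagonal detail band. Neighboring-pixel differences of this kind are what the abstract describes as highly sensitive to resizing.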

benchmark, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Neural Information Processing Systems | Jun-15-2026, 11:41:00 GMT

Previous research has investigated the application of Multimodal Large Language Models (MLLMs) in understanding 3D scenes by interpreting them as videos. These approaches generally depend on comprehensive 3D data inputs, such as point clouds or reconstructed Bird's-Eye View (BEV) maps. In our research, we advance this field by enhancing the capability of MLLMs to understand and reason in 3D spaces directly from video data, without the need for additional 3D input. We propose a novel and efficient method called the Video-3D Geometry Large Language Model (VG LLM). Our approach utilizes a 3D visual geometry encoder to extract 3D prior information from video sequences. This information is then integrated with visual tokens and input into the MLLM. Extensive experiments show that our method achieves substantial improvements in various tasks related to 3D scene understanding and spatial reasoning, all directly learned from video sources. Impressively, our 4B model, which does not rely on explicit 3D data inputs, achieves competitive results compared to existing state-of-the-art methods, and even surpasses Gemini-1.5-Pro in the VSI-Bench evaluations.

artificial intelligence, large language model, natural language, (6 more...)

Neural Information Processing Systems

Genre: Research Report (0.60)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

1d83ad88759cef8192451543e5d59bf6-Paper-Conference.pdf

Neural Information Processing Systems | Jun-15-2026, 11:40:48 GMT

Recent advances in large language models have significantly improved textual reasoning through the effective use of Chain-of-Thought (CoT) and reinforcement learning.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Europe (0.48)
Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Improved Bounds for Swap Multicalibration and Swap Omniprediction

Neural Information Processing Systems | Jun-15-2026, 11:30:20 GMT

In this paper, we consider the related problems of multicalibration, a multigroup fairness notion, and omniprediction, a simultaneous loss minimization paradigm, in both the distributional and online settings. The recent work of Garg et al. (2024) raised the open problem of whether it is possible to efficiently achieve O(√T) ℓ2-multicalibration error against bounded linear functions. In this paper, we answer this question in a strongly affirmative sense.

artificial intelligence, machine learning, multicalibration, (16 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

1d6817cf271f0da4efbb5fde96ff52b3-Paper-Conference.pdf

Neural Information Processing Systems | Jun-15-2026, 11:29:59 GMT

Rotary Positional Encodings (RoPE) have emerged as a highly effective technique for one-dimensional sequences in Natural Language Processing, spurring recent progress towards generalizing RoPE to higher-dimensional data such as images and videos. The success of RoPE has been thought to be due to its positional equivariance, i.e. its status as a relative positional encoding. In this paper, we mathematically show RoPE to be one of the most general solutions for equivariant positional embedding in one-dimensional data. Moreover, we show Mixed RoPE to be the analogously general solution for M-dimensional data if we require commutative generators, a property necessary for RoPE's equivariance. However, we question whether strict equivariance plays a large role in RoPE's performance. We propose Spherical RoPE, a method analogous to Mixed RoPE that assumes non-commutative generators. Empirically, we find Spherical RoPE to have equivalent or better learning behavior compared to its equivariant analogues. This suggests that relative positional embeddings are not as important as is commonly believed, at least within computer vision. We expect this discovery to facilitate future work in positional encodings for vision that can be faster and generalize better by removing the preconception that they must be relative.
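The "relative positional encoding" property mentioned above can be checked numerically for standard one-dimensional RoPE. The following is a minimal sketch, not the paper's Mixed or Spherical RoPE; the helper names `rotate` and `dot`, the frequencies, and the query/key values are all assumptions chosen for illustration.

```python
import math


def rotate(vec, pos, freqs):
    """Apply a 1D rotary encoding: rotate each consecutive pair of
    coordinates of `vec` by the angle pos * freq."""
    out = []
    for i, freq in enumerate(freqs):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(pos * freq), math.sin(pos * freq)
        out.extend([c * x - s * y, s * x + c * y])
    return out


def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Illustrative values only (not taken from the paper).
freqs = [1.0, 0.1]
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# The same offset (3) at two different absolute positions.
score_a = dot(rotate(q, 5, freqs), rotate(k, 2, freqs))
score_b = dot(rotate(q, 105, freqs), rotate(k, 102, freqs))
```

Because planar rotations commute, the attention score depends only on the position difference, so `score_a` and `score_b` agree up to floating-point error. This commutativity is the property the abstract says is lost when generators are non-commutative.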

frequency, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

1d5c1d0d32666a4c2568dab0aeb0f0fe-Paper-Conference.pdf

Neural Information Processing Systems | Jun-15-2026, 11:28:44 GMT

Diffusion-based purification (DBP) has become a cornerstone defense against adversarial examples (AEs), regarded as robust due to its use of diffusion models (DMs) that project AEs onto the natural data manifold. We refute this core claim, theoretically proving that gradient-based attacks effectively target the DM rather than the classifier, causing DBP's outputs to align with adversarial distributions. This prompts a reassessment of DBP's robustness, attributing it to two critical factors: inaccurate gradients and improper evaluation protocols that test only a single random purification of the AE. We show that when accounting for stochasticity and resubmission risk, DBP collapses. To support this, we introduce DiffBreak, the first reliable toolkit for differentiation through DBP, eliminating gradient mismatches that previously further inflated robustness estimates. We also analyze the current defense scheme used for DBP, where classification relies on a single purification, pinpointing its inherent invalidity. We provide a statistically grounded majority-vote (MV) alternative that aggregates predictions across multiple purified copies, showing partial but meaningful robustness gain. We then propose a novel adaptation of an optimization method against deepfake watermarking, crafting systemic perturbations that defeat DBP even under MV, challenging DBP's viability.
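The contrast between classifying a single purification and aggregating over several purified copies can be sketched as follows. This is a toy illustration, not DiffBreak or the paper's protocol: `majority_vote_predict`, `toy_purify`, and `toy_classify` are hypothetical names, and the "purifier" is a fixed cycle of perturbations standing in for a stochastic diffusion model.

```python
from collections import Counter
from itertools import cycle


def majority_vote_predict(x, purify, classify, n_copies=9):
    """Classify n_copies independently purified versions of x and
    return the most common label."""
    votes = Counter(classify(purify(x)) for _ in range(n_copies))
    return votes.most_common(1)[0][0]


# Toy stand-ins (assumptions): a deterministic cycle of perturbations
# mimics purification randomness so the example is reproducible.
noise = cycle([-3.0, 0.1, 0.2])


def toy_purify(x):
    return x + next(noise)


def toy_classify(x):
    return int(x > 0.0)


# A single purification can land on an unlucky draw...
single = toy_classify(toy_purify(1.0))
# ...while voting over several copies averages that draw out.
voted = majority_vote_predict(1.0, toy_purify, toy_classify, n_copies=9)
```

Here the single draw gives label 0 while the vote over nine copies gives label 1, which shows in miniature why the outcome of one random purification is not a reliable basis for a prediction.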

artificial intelligence, gradient, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.27)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

Neural Information Processing Systems | Jun-15-2026, 11:20:18 GMT

Discrete state space diffusion models have shown significant advantages in applications involving discrete data, such as text and image generation. It has also been observed that their performance is highly sensitive to the choice of rate matrices, particularly between uniform and absorbing rate matrices. While empirical results suggest that absorbing rate matrices often yield better generation quality compared to uniform rate matrices, existing theoretical works have largely focused on the uniform rate matrices case. Notably, convergence guarantees and error analyses for absorbing diffusion models are still missing. In this work, we provide the first finite-time error bounds and convergence rate analysis for discrete diffusion models using absorbing rate matrices.

artificial intelligence, diffusion model, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Modeling Dynamic Neural Activity by Combining Naturalistic Video Stimuli and Stimulus-Independent Latent Factors

Neural Information Processing Systems | Jun-15-2026, 11:19:58 GMT

Neural activity in visual processing is influenced by both external stimuli and internal brain states. Ideally, a neural predictive model should account for both of them. Currently, there are no dynamic encoding models that explicitly model a latent state and the entire neuronal response distribution. We address this gap by proposing a probabilistic model that predicts the joint distribution of the neuronal responses from video stimuli and stimulus-independent latent factors. After training and testing our model on mouse V1 neuronal responses, we find that it outperforms video-only models in terms of log-likelihood and achieves improvements in likelihood and correlation when conditioned on responses from other neurons. Furthermore, we find that the learned latent factors strongly correlate with mouse behavior and that they exhibit patterns related to the neurons' position on the visual cortex, although the model was trained without behavior and cortical coordinates. Our findings demonstrate that unsupervised learning of latent factors from population responses can reveal biologically meaningful structure that bridges sensory processing and behavior, without requiring explicit behavioral annotations during training.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany > Lower Saxony (0.28)

Genre: