AITopics | Optimization

Unraveling the Gradient Descent Dynamics of Transformers

Neural Information Processing Systems | Oct-10-2025, 12:33:53 GMT

By analyzing the loss landscape of a single Transformer layer using Softmax and Gaussian attention kernels, our work provides concrete answers to these questions.

equation, inequality, transformer, (13 more...)

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

a6e1f6963f65bcc4854691a15460dbd8-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 12:25:05 GMT

algorithm, constraint violation, theorem 3, (15 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Singapore (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Rethinking 3D Convolution in ℓp-norm Space

Neural Information Processing Systems | Oct-10-2025, 12:17:26 GMT

Convolution is a fundamental operation in the 3D backbone.

convolution, experiment, justification, (15 more...)

Country:

Asia > China > Anhui Province > Hefei (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece > Attica > Athens (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Data Science > Data Mining (0.93)
(5 more...)

a5321f64005b0d4a94d0b18e84e19f48-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 12:17:01 GMT

dataset, experiment, optimization, (15 more...)

Country:

Asia > Singapore (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Stochastic Newton Proximal Extragradient Method

Neural Information Processing Systems | Oct-10-2025, 12:16:35 GMT

However, these methods typically reach superlinear convergence only when the stochastic Hessian noise diminishes, increasing per-iteration costs over time.

convergence, inequality, iteration, (17 more...)

Country: North America > United States > Michigan (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

a5059a9a389ccc76da85760ea79490d8-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 12:10:39 GMT

arxiv preprint arxiv, diffusion model, guidance, (13 more...)

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > Experimental Study (0.93)
Overview (0.92)

Industry:

Information Technology (0.46)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(2 more...)

Multidimensional Fractional Programming for Normalized Cuts (Yannan Chen, Beichen Huang)

Neural Information Processing Systems | Oct-10-2025, 11:59:34 GMT

FP method and the minorization-maximization theory to verify the convergence.

algorithm, dataset, ncut problem, (17 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

a28af221f2f70be183afc16797a56b91-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 11:53:41 GMT

constraint, diffusion model, experiment, (17 more...)

Country: North America > United States > Virginia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Molecule Design by Latent Prompt Transformer

Neural Information Processing Systems | Oct-10-2025, 11:52:32 GMT

This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task, where target biological properties or desired chemical constraints serve as conditioning variables.

lpt, molecule, optimization, (13 more...)

Country: