A Convexity-dependent Two-Phase Training Algorithm for Deep Neural Networks
Hrycej, Tomas, Bermeitinger, Bernhard, Pavone, Massimo, Wiegand, Götz-Henrik, Handschuh, Siegfried
The key task of machine learning is to minimize the loss function that measures the model's fit to the training data. The numerical methods for doing this efficiently depend on the properties of the loss function, the most decisive of which is its convexity or non-convexity. The fact that the loss function can have, and frequently has, non-convex regions has led to a widespread commitment to non-convex methods such as Adam. However, a local minimum implies that the function is convex in some neighborhood around it, and in that neighborhood second-order minimization methods such as the Conjugate Gradient (CG) method offer guaranteed superlinear convergence. We propose a novel framework grounded in the hypothesis that loss functions in real-world tasks switch from initial non-convexity to convexity as the optimum is approached, a property we leverage to design a two-phase optimization algorithm. The presented algorithm detects the switch point by observing how the gradient norm depends on the loss, and applies a non-convex method (Adam) and a convex method (CG) in the respective regions. Computing experiments confirm the hypothesis that this simple convexity structure is frequent enough to be exploited in practice, substantially improving convergence and accuracy.
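The following is a rough, hypothetical sketch of the two-phase idea, not the authors' implementation: Adam runs until a crude convexity signal stabilizes, after which a second-order optimizer takes over. PyTorch ships no nonlinear Conjugate Gradient optimizer, so LBFGS stands in for CG here, and the switch criterion (stability of ||g||^2 / loss over a window) is our own stand-in for the paper's gradient-norm-versus-loss detection rule.

```python
# Hypothetical sketch only: Adam phase followed by a second-order phase.
# LBFGS substitutes for CG, and the switch heuristic is an assumption of ours.
import torch


def grad_norm(model):
    # Euclidean norm of the concatenated gradient over all parameters.
    return torch.sqrt(sum((p.grad ** 2).sum()
                          for p in model.parameters() if p.grad is not None))


def two_phase_train(model, loss_fn, data, targets,
                    max_epochs=200, window=10, tol=0.05):
    adam = torch.optim.Adam(model.parameters(), lr=1e-3)
    ratios, epoch = [], 0

    # Phase 1: Adam in the (possibly non-convex) region.
    # Track ||g||^2 / loss as a crude, assumed proxy for local convexity.
    while epoch < max_epochs:
        adam.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        ratios.append((grad_norm(model) ** 2 / loss.detach()).item())
        adam.step()
        epoch += 1
        if len(ratios) >= window:
            recent = torch.tensor(ratios[-window:])
            if recent.std() / recent.mean() < tol:  # crude convexity signal
                break

    # Phase 2: second-order optimizer in the (assumed) convex region.
    # A full-batch closure is assumed, since it re-evaluates the loss per step.
    lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=20)

    def closure():
        lbfgs.zero_grad()
        loss = loss_fn(model(data), targets)
        loss.backward()
        return loss

    while epoch < max_epochs:
        lbfgs.step(closure)
        epoch += 1
    return model
```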
Appendix for Integrating Momentum into Recurrent Neural Networks
As in Section 3.1, we flatten the image and process it as a sequence of length 784, pixel by pixel. The baseline LSTM models consist of a single LSTM cell with 128 or 256 hidden units. Orthogonal initialization is used for the input-to-hidden weights, while the hidden-to-hidden weights are initialized to identity matrices. Gradient norms are clipped to 1 during training. The log-magnitude of these sequences is fed into the models as the input data.
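A minimal sketch of this baseline, assuming PyTorch and standard MNIST tensors. The class name, optimizer, and learning rate are our own choices; the sequence length of 784, the hidden size (128 shown), the orthogonal/identity initializations, and the gradient-norm clipping follow the description above.

```python
# Minimal sketch of the described pixel-by-pixel LSTM baseline (assumptions noted above).
import torch
import torch.nn as nn


class PixelLSTM(nn.Module):
    def __init__(self, hidden_size=128, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)
        # Orthogonal input-to-hidden weights, identity hidden-to-hidden weights.
        for name, param in self.lstm.named_parameters():
            if "weight_ih" in name:
                nn.init.orthogonal_(param)
            elif "weight_hh" in name:
                with torch.no_grad():
                    param.copy_(torch.eye(hidden_size).repeat(4, 1))

    def forward(self, images):                      # images: (B, 1, 28, 28)
        seq = images.view(images.size(0), 784, 1)   # flatten to a 784-step sequence
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])                # classify from the final step


# One training step with the gradient-norm clipping mentioned in the text.
model = PixelLSTM(hidden_size=128)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # optimizer choice is an assumption
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
opt.zero_grad()
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
opt.step()
```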
Gradient Flow Matching for Learning Update Dynamics in Neural Network Training
Shou, Xiao, Ding, Yanna, Gao, Jianxi
Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training as a dynamical system governed by learned optimizer-aware vector fields. By leveraging conditional flow matching, GFM captures the underlying update rules of optimizers such as SGD, Adam, and RMSprop, enabling smooth extrapolation of weight trajectories toward convergence. Unlike black-box sequence models, GFM incorporates structural knowledge of gradient-based updates into the learning objective, facilitating accurate forecasting of final weights from partial training sequences. Empirically, GFM achieves forecasting accuracy that is competitive with Transformer-based models and significantly outperforms LSTM and other classical baselines. Furthermore, GFM generalizes across neural architectures and initializations, providing a unified framework for studying optimization dynamics and accelerating convergence prediction.
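As a heavily simplified, hypothetical illustration of the flow-matching idea applied to weight trajectories (not the GFM architecture itself): a small network is trained to match the velocity of a straight-line interpolant between an early checkpoint and the converged weights, conditioned on a one-hot optimizer identity, and is then Euler-integrated to extrapolate final weights. The field network, the interpolant, and the conditioning scheme are our assumptions.

```python
# Simplified conditional flow matching over flattened weight vectors (illustrative only).
import torch
import torch.nn as nn

DIM, NUM_OPTS = 256, 3                      # weight-vector size; SGD / Adam / RMSprop

field = nn.Sequential(                      # learned vector field v(w, t, c)
    nn.Linear(DIM + 1 + NUM_OPTS, 512), nn.SiLU(),
    nn.Linear(512, DIM),
)
opt = torch.optim.Adam(field.parameters(), lr=1e-3)


def flow_matching_step(w_early, w_final, opt_id):
    # w_early, w_final: (B, DIM) flattened checkpoints; opt_id: (B,) optimizer index.
    t = torch.rand(w_early.size(0), 1)                  # random time in [0, 1]
    w_t = (1 - t) * w_early + t * w_final               # point on the straight-line path
    target_v = w_final - w_early                        # its (constant) velocity
    cond = nn.functional.one_hot(opt_id, NUM_OPTS).float()
    pred_v = field(torch.cat([w_t, t, cond], dim=1))
    loss = ((pred_v - target_v) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


@torch.no_grad()
def forecast(w_early, opt_id, steps=50):
    # Euler-integrate the learned field from t=0 to t=1 to extrapolate final weights.
    w, dt = w_early.clone(), 1.0 / steps
    cond = nn.functional.one_hot(opt_id, NUM_OPTS).float()
    for i in range(steps):
        t = torch.full((w.size(0), 1), i * dt)
        w = w + dt * field(torch.cat([w, t, cond], dim=1))
    return w
```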
YOSO: You-Only-Sample-Once via Compressed Sensing for Graph Neural Network Training
Li, Yi, Guo, Zhichun, Li, Guanpeng, Li, Bingzhe
Graph neural networks (GNNs) have become essential tools for analyzing non-Euclidean data across various domains. During the training stage, sampling plays an important role in reducing latency by limiting the number of nodes processed, particularly in large-scale applications. However, as the demand for better prediction performance grows, existing sampling algorithms become increasingly complex, leading to significant overhead. To mitigate this, we propose YOSO (You-Only-Sample-Once), an algorithm designed to achieve efficient training while preserving prediction accuracy. YOSO introduces a compressed sensing (CS)-based sampling and reconstruction framework, where nodes are sampled once at the input layer, followed by a lossless reconstruction at the output layer in each epoch. By integrating the reconstruction process with the loss function of the specific learning task, YOSO not only avoids costly computations of traditional CS methods, such as orthonormal basis calculations, but also ensures, with high probability, accuracy retention equivalent to full node participation. Experimental results on node classification and link prediction demonstrate the effectiveness and efficiency of YOSO, reducing GNN training cost by an average of 75% compared to state-of-the-art methods, while maintaining accuracy on par with top-performing baselines.
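For background only, the sketch below shows the generic compressed-sensing primitive the abstract refers to: a sparse signal is measured once by a random matrix and recovered by iterative soft-thresholding (ISTA). This is standard CS, not YOSO's pipeline, which instead folds reconstruction into the task loss and avoids steps such as orthonormal basis calculations; all names here are ours.

```python
# Generic compressed sensing demo: one round of random measurements, ISTA recovery.
import torch


def ista_recover(A, y, lam=0.05, steps=500):
    # Solve min_x 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient descent.
    L = torch.linalg.matrix_norm(A, ord=2) ** 2        # Lipschitz constant of the smooth part
    x = torch.zeros(A.size(1))
    for _ in range(steps):
        grad = A.T @ (A @ x - y)
        z = x - grad / L
        x = torch.sign(z) * torch.clamp(z.abs() - lam / L, min=0.0)  # soft threshold
    return x


torch.manual_seed(0)
n, m, k = 200, 60, 5                                   # signal dim, #measurements, sparsity
x_true = torch.zeros(n)
x_true[torch.randperm(n)[:k]] = torch.randn(k)
A = torch.randn(m, n) / m ** 0.5                       # random measurement matrix ("sample once")
y = A @ x_true                                         # measure once
x_hat = ista_recover(A, y)
print((x_hat - x_true).norm() / x_true.norm())         # relative recovery error
```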
Kernel Orthogonality does not necessarily imply a Decrease in Feature Map Redundancy in CNNs: Convolutional Similarity Minimization
Belmekki, Zakariae, Li, Jun, Reuter, Patrick, Jáuregui, David Antonio Gómez, Jenkins, Karl
Convolutional Neural Networks (CNNs) have been used heavily in deep learning due to their success in various tasks. Nonetheless, it has been observed that CNNs suffer from redundancy in feature maps, leading to inefficient capacity utilization. Efforts to mitigate this problem have produced multiple methods, among them kernel orthogonality enforced through various means. In this work, we challenge the common belief that kernel orthogonality leads to a decrease in feature map redundancy, which is, supposedly, the ultimate objective behind kernel orthogonality. We prove, theoretically and empirically, that kernel orthogonality has an unpredictable effect on feature map similarity and does not necessarily decrease it. Based on our theoretical result, we propose an effective method to reduce feature map similarity independently of the input of the CNN, by minimizing a novel loss function we call Convolutional Similarity. Empirical results show that minimizing the Convolutional Similarity increases the performance of classification models and can accelerate their convergence. Furthermore, the proposed method pushes towards a more efficient use of model capacity, allowing significantly smaller models to achieve the same levels of performance.
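To make the notion of feature map redundancy concrete, here is a small activation-based measure: the mean absolute cosine similarity between pairs of channels in a feature map, which could be added to a task loss as a penalty. Note that the paper's Convolutional Similarity loss is constructed to be independent of the CNN's input, so this input-dependent variant is only an illustrative stand-in; all names are ours.

```python
# Illustrative redundancy penalty: mean absolute pairwise cosine similarity of channels.
import torch
import torch.nn.functional as F


def feature_map_similarity(fmap):
    # fmap: (B, C, H, W) activations from one convolutional layer.
    b, c, h, w = fmap.shape
    flat = F.normalize(fmap.reshape(b, c, h * w), dim=2)   # unit-norm channel vectors
    sim = torch.bmm(flat, flat.transpose(1, 2))            # (B, C, C) cosine similarities
    off_diag = sim - torch.eye(c, device=fmap.device)      # zero out the diagonal
    return off_diag.abs().sum(dim=(1, 2)).mean() / (c * (c - 1))


# Usage: add the penalty to the task loss during training, e.g.
# loss = task_loss + 0.1 * feature_map_similarity(intermediate_activations)
fmap = torch.randn(4, 16, 8, 8)
penalty = feature_map_similarity(fmap)
```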