Neumann series
Appendix of Joint Data-Task Generation for Auxiliary Learning
Chen, Hong
We provide the derivation of the upper implicit gradient. We summarize the whole DTG-AuxL algorithm in Algorithm 1, where the lower and upper optimization updates are conducted alternately. We use batch stochastic gradient optimization for both the lower and upper updates. STL: a natural baseline where we train only on the primary task. Equal: a multi-task learning method where we assign an equal weight of 1.0 to the loss of each task. MAXL: an auxiliary learning baseline that can only be applied to classification problems.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (4 more...)
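To make the alternating lower/upper update pattern from the appendix above concrete, here is a minimal PyTorch sketch of a generic bi-level loop. The module names, losses, and data are hypothetical placeholders, and the upper step uses a simple first-order proxy rather than the implicit gradient the paper derives.

```python
import torch

# Hypothetical sketch of alternating lower/upper (bi-level) updates.
# All names and losses are illustrative, not the authors' implementation.
model = torch.nn.Linear(10, 1)        # lower-level learner (primary task)
generator = torch.nn.Linear(10, 10)   # upper-level data/task generator

lower_opt = torch.optim.SGD(model.parameters(), lr=1e-2)
upper_opt = torch.optim.SGD(generator.parameters(), lr=1e-3)

for step in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)   # stand-in batch

    # Lower update: fit the learner on primary plus generated auxiliary data.
    aux_x = generator(x).detach()
    lower_loss = ((model(x) - y) ** 2).mean() + ((model(aux_x) - y) ** 2).mean()
    lower_opt.zero_grad()
    lower_loss.backward()
    lower_opt.step()

    # Upper update: adjust the generator against the primary-task loss.
    # (The true upper gradient is implicit; this is a first-order proxy.)
    upper_loss = ((model(generator(x)) - y) ** 2).mean()
    upper_opt.zero_grad()
    upper_loss.backward()
    upper_opt.step()
```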
Conditional Independence Estimates for the Generalized Nonparanormal
Shah, Ujas, Lladser, Manuel, Morrison, Rebecca
For general non-Gaussian distributions, the covariance and precision matrices do not encode the independence structure of the variables, as they do for the multivariate Gaussian. This paper builds on previous work to show that for a class of non-Gaussian distributions -- those derived from diagonal transformations of a Gaussian -- information about the conditional independence structure can still be inferred from the precision matrix, provided the data meet certain criteria, analogous to the Gaussian case. We call such transformations of the Gaussian the generalized nonparanormal. The functions that define these transformations are, in a broad sense, arbitrary. We also provide a simple and computationally efficient algorithm that leverages this theory to recover conditional independence structure from generalized nonparanormal data. The effectiveness of the proposed algorithm is demonstrated via synthetic experiments and applications to real-world data.
- North America > United States > Colorado > Boulder County > Boulder (0.14)
- Europe > United Kingdom > England > Greater Manchester > Rochdale (0.04)
- Europe > Ireland (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
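The following NumPy sketch makes the setting of the abstract above concrete: sample a Gaussian with a known sparse precision matrix, apply a diagonal (coordinate-wise) transformation, and threshold the empirical precision of the transformed data to read off a candidate conditional-independence graph. The transformation, sample size, and threshold are illustrative assumptions; the paper's actual algorithm and validity criteria are more refined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth sparse precision of a 3-node chain: variables 0 and 2 are
# conditionally independent given variable 1 (the (0, 2) entry is zero).
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
Z = rng.multivariate_normal(np.zeros(3), np.linalg.inv(K), size=50_000)

# A diagonal (coordinate-wise, monotone) transformation of the Gaussian.
X = np.sinh(Z)  # illustrative choice of transformation

# Empirical precision of the transformed data; the paper gives criteria under
# which its zero pattern still reflects the conditional-independence graph.
K_hat = np.linalg.inv(np.cov(X, rowvar=False))
print(np.round(K_hat, 2))
print(np.abs(K_hat) > 0.1)  # illustrative threshold for edge recovery
```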
Neumann Series-based Neural Operator for Solving Inverse Medium Problem
Liu, Ziyang, Chen, Fukai, Chen, Junqing, Qiu, Lingyun, Shi, Zuoqiang
The inverse medium problem, inherently ill-posed and nonlinear, presents significant computational challenges. This study introduces a novel approach by integrating a Neumann series structure within a neural network framework to effectively handle multiparameter inputs. Experiments demonstrate that our methodology not only accelerates computations but also significantly enhances generalization performance, even with varying scattering properties and noisy data. The robustness and adaptability of our framework provide crucial insights and methodologies, extending its applicability to a broad spectrum of scattering problems. These advancements mark a significant step forward in the field, offering a scalable solution to traditionally complex inverse problems.
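Since the architecture above is organized around the Neumann series, it may help to recall the underlying identity: when the spectral radius of $A$ is below one, $(I - A)^{-1} = \sum_{k \ge 0} A^k$, so an inverse can be built from repeated applications of $A$. A small self-contained NumPy check of the truncated series (matrix size and scaling are arbitrary choices, unrelated to the paper's operator):

```python
import numpy as np

rng = np.random.default_rng(0)

# Neumann series: (I - A)^{-1} = I + A + A^2 + ..., valid when the
# spectral radius of A is below one.
n = 5
A = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)  # small spectral radius

approx = np.eye(n)
term = np.eye(n)
for _ in range(50):
    term = term @ A      # next power A^k
    approx += term       # accumulate the partial sum

exact = np.linalg.inv(np.eye(n) - A)
print(np.max(np.abs(approx - exact)))  # truncation error is tiny
```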
On Training Implicit Meta-Learning With Applications to Inductive Weighing in Consistency Regularization
Meta-learning methods that use implicit gradients provide an exciting alternative to standard techniques that depend on the trajectory of the inner-loop training. Implicit meta-learning (IML), however, requires computing $2^{nd}$-order gradients, in particular the Hessian, which is impractical to compute for modern deep learning models. Various approximations of the Hessian have been proposed, but a systematic comparison of their compute cost, stability, generalization of the solutions found, and estimation accuracy has been largely overlooked. In this study, we first conduct a systematic comparative analysis of the various approximation methods and their effect when incorporated into IML training routines. We establish situations where catastrophic forgetting is exhibited in IML and explain its cause in terms of the inability of the approximations to estimate the curvature at convergence points. Sources of IML training instability are demonstrated and remedied. A detailed analysis of the efficiency of various inverse Hessian-vector product approximation methods is also provided. Subsequently, we use the insights gained to propose and evaluate a novel semi-supervised learning algorithm that learns to inductively weigh consistency regularization losses. We show how training a "Confidence Network" to extract domain-specific features can learn to up-weigh useful images and down-weigh out-of-distribution samples. Results outperform the baseline FixMatch performance.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (3 more...)
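One of the inverse Hessian-vector product approximations the abstract above alludes to is the truncated Neumann series, the topic of this digest. Below is a minimal PyTorch sketch of that approximation; the function name, step count, and scaling `alpha` are illustrative assumptions, and convergence requires `alpha` small enough that $\|I - \alpha H\| < 1$.

```python
import torch

def neumann_ihvp(loss, params, v, steps=10, alpha=0.01):
    """Approximate H^{-1} v via the truncated Neumann series
    H^{-1} = alpha * sum_k (I - alpha * H)^k, using only
    Hessian-vector products (never the full Hessian)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    p = [alpha * vi for vi in v]        # current series term
    acc = [pi.clone() for pi in p]      # running partial sum
    for _ in range(steps):
        # Hessian-vector product H p via double backward.
        hvp = torch.autograd.grad(grads, params, grad_outputs=p,
                                  retain_graph=True)
        p = [pi - alpha * hi for pi, hi in zip(p, hvp)]
        acc = [ai + pi for ai, pi in zip(acc, p)]
    return acc  # list of tensors approximating H^{-1} v
```

In an IML loop, the returned vectors would be contracted with the validation gradient to form the hypergradient, which is where the approximation quality studied in the paper matters.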
Reviving and Improving Recurrent Back-Propagation
Liao, Renjie, Xiong, Yuwen, Fetaya, Ethan, Zhang, Lisa, Yoon, KiJung, Pitkow, Xaq, Urtasun, Raquel, Zemel, Richard
In this paper, we revisit the recurrent backpropagation (RBP) algorithm (Almeida, 1987; Pineda, 1987), discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and backpropagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has the same time complexity as TBPTT but only requires constant memory, whereas TBPTT's memory cost scales linearly with the number of truncation steps. We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks, and hyperparameter optimization for fully connected networks. All experiments demonstrate that RBPs, especially the Neumann-RBP variant, are efficient and effective for optimizing convergent recurrent neural networks.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York (0.04)
- North America > United States > District of Columbia > Washington (0.04)
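Since Neumann-RBP is the headline variant above, here is a condensed PyTorch sketch of the idea: backpropagate through a fixed point $h^* = f(h^*)$ by accumulating vector-Jacobian products, truncating the series $(I - J^\top)^{-1} g = \sum_k (J^\top)^k g$ where $J = \partial f / \partial h$ at $h^*$. The function name and step count are assumptions for illustration; this is not the authors' released code.

```python
import torch

def neumann_rbp(f, h_star, grad_h, steps=20):
    """Approximate (I - J^T)^{-1} grad_h at a fixed point h* = f(h*),
    where J = df/dh, by truncating sum_k (J^T)^k grad_h. Uses only
    vector-Jacobian products, so memory stays constant in `steps`."""
    h_star = h_star.detach().requires_grad_(True)
    out = f(h_star)
    v, acc = grad_h.clone(), grad_h.clone()
    for _ in range(steps):
        # v <- J^T v via one vector-Jacobian product.
        (v,) = torch.autograd.grad(out, h_star, grad_outputs=v,
                                   retain_graph=True)
        acc = acc + v
    return acc
```

The constant-memory property follows directly from the loop structure: only the current vector `v` and the running sum `acc` are stored, regardless of how many series terms are taken, which matches the abstract's comparison against TBPTT.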