Amortized Eigendecomposition for Neural Networks

Neural Information Processing Systems

Performing eigendecomposition during neural network training is essential for tasks such as dimensionality reduction, network compression, image denoising, and graph learning. However, eigendecomposition is computationally expensive, as it is orders of magnitude slower than other neural network operations. To address this challenge, we propose a novel approach called amortized eigendecomposition that relaxes the exact eigendecomposition by introducing an additional loss term called the eigen loss. Our approach offers significant speed improvements by replacing the computationally expensive eigendecomposition with a more affordable QR decomposition at each iteration. Theoretical analysis guarantees that the desired eigenpair is attained as an optimum of the eigen loss. Empirical studies on nuclear norm regularization, latent-space principal component analysis, and graph adversarial learning demonstrate significant improvements in training efficiency while producing nearly identical outcomes to conventional approaches. This methodology promises to integrate eigendecomposition efficiently into neural network training, overcoming existing computational challenges and unlocking new potential for advanced deep learning applications.
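The core idea described in the abstract can be illustrated with a minimal NumPy sketch (not the paper's exact formulation): treat the eigenvectors as trainable parameters, take gradient steps on a Rayleigh-quotient-style objective in place of an exact eigen loss, and re-orthonormalize with a cheap QR decomposition each iteration instead of calling a full eigensolver. The function name and hyperparameters below are illustrative assumptions.

```python
import numpy as np

def amortized_top_eigvecs(A, k, steps=1000, lr=0.05, seed=0):
    """Approximate the top-k eigenvectors of a symmetric matrix A.

    Illustrative sketch: gradient ascent on tr(U^T A U) (whose gradient
    is 2 A U), with a QR re-orthonormalization after each step standing
    in for an exact eigendecomposition.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(steps):
        U = U + lr * (A @ U)      # ascend the trace objective
        U, _ = np.linalg.qr(U)    # cheap retraction: keep columns orthonormal
    return U
```

In a training loop, the same U would be carried across iterations ("amortized"), so each step pays only for one QR factorization rather than a full eigensolve.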


Solving Interpretable Kernel Dimensionality Reduction

Neural Information Processing Systems

Kernel dimensionality reduction (KDR) algorithms find a low-dimensional representation of the original data by optimizing kernel dependency measures that are capable of capturing nonlinear relationships. The standard strategy is to first map the data into a high-dimensional feature space using kernels prior to a projection onto a low-dimensional space. While KDR methods can be easily solved by keeping the most dominant eigenvectors of the kernel matrix, the resulting features are no longer easy to interpret. Alternatively, Interpretable KDR (IKDR) differs in that it projects onto a subspace before the kernel feature mapping; therefore, the projection matrix can indicate how the original features linearly combine to form the new features. Unfortunately, the IKDR objective requires a non-convex manifold optimization that is difficult to solve and no longer admits an eigendecomposition solution.
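To make the manifold-optimization setting concrete, here is a hedged NumPy sketch (not the paper's solver) of the IKDR structure for the special case of a linear kernel: a projection W with orthonormal columns is learned by gradient ascent on an HSIC-style dependence tr(K H L H) with K = X W Wᵀ Xᵀ, retracting onto the Stiefel manifold with QR after each step. With a linear kernel this optimum is again eigen-solvable; the point is that the same loop applies unchanged to nonlinear kernels, where no closed form exists. All names here are illustrative assumptions.

```python
import numpy as np

def ikdr_linear(X, L, k, steps=500, lr=1e-3, seed=0):
    """Illustrative IKDR-style loop (linear-kernel special case).

    Maximizes tr(K H L H) with K = X W W^T X^T over projections W with
    orthonormal columns, using a QR retraction onto the Stiefel manifold.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    M = X.T @ H @ L @ H @ X               # gradient of the objective is 2 M W
    W, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for _ in range(steps):
        W = W + lr * (2 * M @ W)          # Euclidean gradient step
        W, _ = np.linalg.qr(W)            # retract: keep columns orthonormal
    return W
```

The returned W is interpretable in the sense the abstract describes: each column shows how the original features linearly combine to form one projected feature.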




Response to Reviewer # 1

Neural Information Processing Systems

"I would like to see Theorem 4 reworded. It assumes that the underlying process has a correct clustering of states?" Thm 4 assumes there is an underlying partition that attains the smallest value of distortion (eq(4)). We will reword Thm 4 to make it easier to interpret. "How to find state pairs in DQN analysis?" We screened the top 100 closest pairs and picked those with large raw-data distances.
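The screening procedure mentioned in the response could be sketched as follows (a hypothetical illustration, not the authors' code): rank state pairs by distance in the learned embedding, keep the closest pairs, then retain those whose raw-data distance exceeds a threshold. The function name, arguments, and threshold choice are assumptions.

```python
import numpy as np

def interesting_pairs(Z, X, top=100, raw_thresh=None):
    """Find pairs close in embedding space Z but far apart in raw space X.

    Hypothetical sketch of the described screening: take the `top`
    closest pairs by embedding distance, then keep those whose raw-data
    distance exceeds `raw_thresh` (median raw distance by default).
    """
    n = Z.shape[0]
    i, j = np.triu_indices(n, k=1)                     # all unordered pairs
    dz = np.linalg.norm(Z[i] - Z[j], axis=1)           # embedding distances
    dx = np.linalg.norm(X[i] - X[j], axis=1)           # raw-data distances
    order = np.argsort(dz)[:top]                       # closest in embedding
    if raw_thresh is None:
        raw_thresh = np.median(dx)
    keep = order[dx[order] > raw_thresh]               # far apart in raw data
    return [(int(a), int(b)) for a, b in zip(i[keep], j[keep])]
```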





We thank all the reviewers for their insightful and encouraging feedback

Neural Information Processing Systems

We thank all the reviewers for their insightful and encouraging feedback. Due to the discrete nature of COMBO's search space, the implementation details differ slightly. In contrast, on COMBO's combinatorial graphs, we use spray vertices. As R2 suggested, this heuristic promotes exploitation. Using random vertices for exploration is similar to Spearmint.
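The candidate-generation scheme described above could be sketched as follows (a hypothetical illustration, not COMBO's implementation): "spray" vertices are graph neighbors of the current best vertex and promote exploitation, while uniformly random vertices promote exploration. The function name and parameters are assumptions; the graph is a plain adjacency dict.

```python
import random

def candidate_vertices(graph_neighbors, best_vertex, n_spray=10, n_random=10, seed=0):
    """Generate acquisition candidates on a combinatorial graph.

    Hypothetical sketch: breadth-first "spray" vertices around the
    incumbent best vertex (exploitation) plus uniformly random vertices
    (exploration, in the spirit of Spearmint's random candidates).
    """
    rng = random.Random(seed)
    spray, frontier = set(), [best_vertex]
    while len(spray) < n_spray and frontier:
        v = frontier.pop(0)
        for u in graph_neighbors[v]:
            if u != best_vertex and u not in spray:
                spray.add(u)
                frontier.append(u)
            if len(spray) >= n_spray:
                break
    vertices = list(graph_neighbors)
    rand = rng.sample(vertices, min(n_random, len(vertices)))
    return sorted(spray), rand
```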