Goto






BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation

Neural Information Processing Systems

Graph neural networks (GNNs) have received extensive attention from researchers due to their excellent performance on various graph learning tasks such as social analysis [24, 17, 29], drug discovery [12, 25], traffic forecasting [18, 3, 6], recommendation systems [38, 32], and computer vision [39, 4].
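The snippet above stops at the motivation; for context, BernNet's central idea is to parameterize a graph spectral filter as a combination of Bernstein basis polynomials over the normalized Laplacian's eigenvalue range [0, 2]. The NumPy sketch below is our illustration, not the authors' released code; the helper name bernstein_filter, the degree K, and the example coefficients are assumptions.

import numpy as np
from scipy.special import comb

def bernstein_filter(eigenvalues, theta):
    """Evaluate a degree-K Bernstein-basis spectral filter
    h(lam) = sum_k theta_k * C(K, k) * (1 - lam/2)^(K-k) * (lam/2)^k
    for Laplacian eigenvalues lam in [0, 2]."""
    K = len(theta) - 1
    lam = np.asarray(eigenvalues) / 2.0          # rescale [0, 2] -> [0, 1]
    basis = np.stack([comb(K, k) * (1 - lam) ** (K - k) * lam ** k
                      for k in range(K + 1)])    # shape (K+1, n)
    return np.asarray(theta) @ basis             # filter response per eigenvalue

# Example: non-negative coefficients yield a non-negative (here low-pass-like) response.
lam = np.linspace(0.0, 2.0, 5)
print(bernstein_filter(lam, theta=[1.0, 0.5, 0.0]))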


Beyond Predictive Uncertainty: Reliable Representation Learning with Structural Constraints

Yang, Yiyao

arXiv.org Machine Learning

Uncertainty estimation in machine learning has traditionally focused on the prediction stage, aiming to quantify confidence in model outputs while treating learned representations as deterministic and reliable by default. In this work, we challenge this implicit assumption and argue that reliability should be regarded as a first-class property of learned representations themselves. We propose a principled framework for reliable representation learning that explicitly models representation-level uncertainty and leverages structural constraints as inductive biases to regularize the space of feasible representations. Our approach introduces uncertainty-aware regularization directly in the representation space, encouraging representations that are not only predictive but also stable, well-calibrated, and robust to noise and structural perturbations. Structural constraints, such as sparsity, relational structure, or feature-group dependencies, are incorporated to define meaningful geometry and reduce spurious variability in learned representations, without assuming fully correct or noise-free structure. Importantly, the proposed framework is independent of specific model architectures and can be integrated with a wide range of representation learning methods.
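The abstract is deliberately architecture-agnostic, so the PyTorch sketch below is only one possible reading of "representation-level uncertainty plus structural constraints": the module name GaussianEncoder, the loss weights, and the choice of a KL-to-unit-Gaussian penalty plus an L1 sparsity term are our assumptions, not the paper's method.

import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Illustrative encoder with a per-dimension mean and log-variance,
    so uncertainty is attached to the representation itself."""
    def __init__(self, d_in, d_rep):
        super().__init__()
        self.mu = nn.Linear(d_in, d_rep)
        self.log_var = nn.Linear(d_in, d_rep)

    def forward(self, x):
        return self.mu(x), self.log_var(x)

def representation_loss(mu, log_var, lam_unc=1e-2, lam_sparse=1e-3):
    # Uncertainty-aware penalty: KL to a unit Gaussian discourages both
    # collapsed (over-confident) and over-dispersed representations.
    kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1.0).sum(dim=1).mean()
    # Structural constraint: sparsity on the representation mean.
    sparsity = mu.abs().sum(dim=1).mean()
    return lam_unc * kl + lam_sparse * sparsity

# Usage: add this term to the usual task loss.
enc = GaussianEncoder(d_in=16, d_rep=8)
mu, log_var = enc(torch.randn(4, 16))
print(representation_loss(mu, log_var))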


Efficiently escaping saddle points on manifolds

Neural Information Processing Systems

Smooth, non-convex optimization problems on Riemannian manifolds occur in machine learning as a result of orthonormality, rank or positivity constraints. First- and second-order necessary optimality conditions state that the Riemannian gradient must be zero, and the Riemannian Hessian must be positive semidefinite. Generalizing Jin et al.'s recent work on perturbed gradient descent (PGD) for optimization on linear spaces [How to Escape Saddle Points Efficiently (2017), Stochastic Gradient Descent Escapes Saddle Points Efficiently (2019)], we study a version of perturbed Riemannian gradient descent (PRGD) to show that necessary optimality conditions can be met approximately with high probability, without evaluating the Hessian. Specifically, for an arbitrary Riemannian manifold $\mathcal{M}$ of dimension $d$, a sufficiently smooth (possibly non-convex) objective function $f$, and under weak conditions on the retraction chosen to move on the manifold, with high probability, our version of PRGD produces a point with gradient smaller than $\epsilon$ and Hessian within $\sqrt{\epsilon}$ of being positive semidefinite in $O((\log{d})^4 / \epsilon^{2})$ gradient queries. This matches the complexity of PGD in the Euclidean case. Crucially, the dependence on dimension is low, which matters for large-scale applications including PCA and low-rank matrix completion, which both admit natural formulations on manifolds. The key technical idea is to generalize PRGD with a distinction between two types of gradient steps: ``steps on the manifold'' and ``perturbed steps in a tangent space of the manifold.'' Ultimately, this distinction makes it possible to extend Jin et al.'s analysis seamlessly.
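As a concrete illustration of the distinction between "steps on the manifold" and "perturbed steps in a tangent space," here is a toy NumPy sketch of perturbed Riemannian gradient descent on the unit sphere. It is a simplification under assumed step size, perturbation radius, and stopping rule, not the paper's algorithm or its complexity-matching parameter schedule.

import numpy as np

def sphere_retract(x, v):
    """Retraction on the unit sphere: move along tangent vector v, renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def riemannian_grad(egrad, x):
    """Project the Euclidean gradient onto the tangent space at x."""
    return egrad - np.dot(egrad, x) * x

def prgd_sphere(egrad_f, x0, eta=0.1, eps=1e-3, radius=1e-2, iters=500, seed=0):
    """Toy perturbed Riemannian gradient descent on the sphere: take gradient
    steps on the manifold; when the Riemannian gradient is small, inject a
    random perturbation in the tangent space to help escape saddle points."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        g = riemannian_grad(egrad_f(x), x)
        if np.linalg.norm(g) <= eps:
            xi = riemannian_grad(rng.normal(size=x.shape), x)  # random tangent direction
            xi *= radius / (np.linalg.norm(xi) + 1e-12)
            x = sphere_retract(x, xi)            # perturbed step in a tangent space
        else:
            x = sphere_retract(x, -eta * g)      # ordinary step on the manifold
    return x

# Example: minimize f(x) = x^T A x on the sphere; the minimizer is A's bottom eigenvector.
A = np.diag([3.0, 1.0, -2.0])
x_star = prgd_sphere(lambda x: 2 * A @ x, x0=np.array([1.0, 1e-6, 1e-6]))
print(x_star)  # should approach +/- e_3 despite starting near a critical point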



2433fec2144ccf5fea1c9c5ebdbc3924-Supplemental-Conference.pdf

Neural Information Processing Systems

A Proof that the objective in Equation 3 is convex when α is sufficiently small. To validate this statement, we first prove that two factors in the objective are convex (Lemma A.1 and Lemma A.2) and that their combination preserves convexity (Lemma A.3). Lemma A.1. $P$ and $Q$ being positive semidefinite implies that, for every $i \in [1..N]$, $0 \le \lambda_i$; thus $P + \alpha Q$ is positive semidefinite. Combining Lemma A.1, Lemma A.2 and Lemma A.3, the objective of Equation 3 is convex when α is sufficiently small. In addition, to avoid replacement clashes, we do not allow any word to appear in more than one word set. Eventually, the top 50 semantically matching pairs are retained for CATER.
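As a quick sanity check of the positive-semidefiniteness step (under our assumption that the combined matrix is $P + \alpha Q$ with $\alpha \ge 0$; the snippet, matrix sizes, and the value of alpha are illustrative only), the following NumPy code verifies numerically that the smallest eigenvalue of the combination stays nonnegative.

import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    """Build a random positive semidefinite matrix as B^T B."""
    B = rng.normal(size=(n, n))
    return B.T @ B

# For PSD P, Q and alpha >= 0, x^T (P + alpha*Q) x = x^T P x + alpha * x^T Q x >= 0,
# so P + alpha*Q is PSD; its smallest eigenvalue should be (numerically) nonnegative.
P, Q, alpha = random_psd(5), random_psd(5), 0.1
print(np.linalg.eigvalsh(P + alpha * Q).min() >= -1e-10)  # True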


Appendix A Notations We will use

Neural Information Processing Systems

Lipschitz constant of $\nabla r'(\cdot, x)$ in its first argument, which is also finite under Assumption 2.2. The following notations will be used in Appendix E.2: Assuming that