singular point
Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks
Shahverdi, Vahid, Marchetti, Giovanni Luca, Kohn, Kathlén
In this work, we theoretically analyze sub-networks and their bias through the lens of algebraic geometry. We consider fully-connected networks with polynomial activation functions, and focus on the geometry of the function space they parametrize, often referred to as neuroman-ifold. First, we compute the dimension of the subspace of the neuromanifold parametrized by subnetworks. Second, we show that this subspace is singular. Third, we argue that such singularities often correspond to critical points of the training dynamics. Lastly, we discuss convolutional networks, for which subnet-works and singularities are similarly related, but the bias does not arise.Figure 1: Subnetworks define singular points (orange) of the neuromanifold.
De-singularity Subgradient for the $q$-th-Powered $\ell_p$-Norm Weber Location Problem
Lai, Zhao-Rong, Wu, Xiaotian, Fang, Liangda, Chen, Ziliang, Li, Cheng
The Weber location problem is widely used in several artificial intelligence scenarios. However, the gradient of the objective does not exist at a considerable set of singular points. Recently, a de-singularity subgradient method has been proposed to fix this problem, but it can only handle the $q$-th-powered $\ell_2$-norm case ($1\leqslant q<2$), which has only finite singular points. In this paper, we further establish the de-singularity subgradient for the $q$-th-powered $\ell_p$-norm case with $1\leqslant q\leqslant p$ and $1\leqslant p<2$, which includes all the rest unsolved situations in this problem. This is a challenging task because the singular set is a continuum. The geometry of the objective function is also complicated so that the characterizations of the subgradients, minimum and descent direction are very difficult. We develop a $q$-th-powered $\ell_p$-norm Weiszfeld Algorithm without Singularity ($q$P$p$NWAWS) for this problem, which ensures convergence and the descent property of the objective function. Extensive experiments on six real-world data sets demonstrate that $q$P$p$NWAWS successfully solves the singularity problem and achieves a linear computational convergence rate in practical scenarios.
Manifold Learning via Foliations and Knowledge Transfer
Understanding how real data is distributed in high dimensional spaces is the key to many tasks in machine learning. We want to provide a natural geometric structure on the space of data employing a deep ReLU neural network trained as a classifier. Through the data information matrix (DIM), a variation of the Fisher information matrix, the model will discern a singular foliation structure on the space of data. We show that the singular points of such foliation are contained in a measure zero set, and that a local regular foliation exists almost everywhere. Experiments show that the data is correlated with leaves of such foliation. Moreover we show the potential of our approach for knowledge transfer by analyzing the spectrum of the DIM to measure distances between datasets.
Switched Flow Matching: Eliminating Singularities via Switching ODEs
Continuous-time generative models, such as Flow Matching (FM), construct probability paths to transport between one distribution and another through the simulation-free learning of the neural ordinary differential equations (ODEs). During inference, however, the learned model often requires multiple neural network evaluations to accurately integrate the flow, resulting in a slow sampling speed. We attribute the reason to the inherent (joint) heterogeneity of source and/or target distributions, namely the singularity problem, which poses challenges for training the neural ODEs effectively. To address this issue, we propose a more general framework, termed Switched FM (SFM), that eliminates singularities via switching ODEs, as opposed to using a uniform ODE in FM. Importantly, we theoretically show that FM cannot transport between two simple distributions due to the existence and uniqueness of initial value problems of ODEs, while these limitations can be well tackled by SFM. From an orthogonal perspective, our framework can seamlessly integrate with the existing advanced techniques, such as minibatch optimal transport, to further enhance the straightness of the flow, yielding a more efficient sampling process with reduced costs. We demonstrate the effectiveness of the newly proposed SFM through several numerical examples.
A De-singularity Subgradient Approach for the Extended Weber Location Problem
Lai, Zhao-Rong, Wu, Xiaotian, Fang, Liangda, Chen, Ziliang
The extended Weber location problem is a classical optimization problem that has inspired some new works in several machine learning scenarios recently. However, most existing algorithms may get stuck due to the singularity at the data points when the power of the cost function $1\leqslant q<2$, such as the widely-used iterative Weiszfeld approach. In this paper, we establish a de-singularity subgradient approach for this problem. We also provide a complete proof of convergence which has fixed some incomplete statements of the proofs for some previous Weiszfeld algorithms. Moreover, we deduce a new theoretical result of superlinear convergence for the iteration sequence in a special case where the minimum point is a singular point. We conduct extensive experiments in a real-world machine learning scenario to show that the proposed approach solves the singularity problem, produces the same results as in the non-singularity cases, and shows a reasonable rate of linear convergence. The results also indicate that the $q$-th power case ($1
Polynomial-time Approximation Scheme for Equilibriums of Games
Sun, Hongbo, Xia, Chongkun, Yuan, Bo, Wang, Xueqian, Liang, Bin
Nash equilibrium[1] of normal-form game was proposed decades ago, yet even whether PTAS exists for it remains undecided, not to mention for equilibriums of games with dynamics. PTAS for equilibriums of games is important itself in game theory, and the confirmation of its existence may impact multi-agent reinforcement learning research. First, the existence of PTAS relates to the practicality of the amount of computational power in achieving equilibriums of large scale games. It has been proved that exactly computing a Nash equilibrium of a static game is in PPAD-hard class of complexity[2]. Ignoring the possibility that PPAD itself is of polynomial-time[3], PTAS describes methods that approximately compute Nash equilibriums efficiently. Second, the confirmation of previously unknown existence of PTAS for games implies possibility to fundamentally solve the problems of non-stationarity in training and curse of dimensionality[4] in multi-agent reinforcement learning at the same time. Both the two problems are related to the absence of PTAS for equilibriums of games. Non-stationarity in training relates to the fact that existing polynomial-time methods lack convergence guarantee to equilibriums, and curse of dimensionality relates to the fact that methods with convergence guarantee lack polynomial-time complexity.
Singularity Distance Computations for 3-RPR Manipulators Using Extrinsic Metrics
Kapilavai, Aditya, Nawratil, Georg
It is well-known that parallel manipulators are prone to singularities. However, there is still a lack of distance evaluation functions, referred to as metrics, for computing the distance between two 3-RPR configurations. The proposed extrinsic metrics take the combinatorial structure of the manipulator into account as well as different design options. Utilizing these extrinsic metrics, we formulate constrained optimization problems. These problems are aimed at identifying the closest singular configurations for a given non-singular configuration. The solution to the associated system of polynomial equations relies on algorithms from numerical algebraic geometry implemented in the software package \texttt{Bertini}. Furthermore, we have developed a computational pipeline for determining the distance to singularity during a one-parametric motion of the manipulator. To facilitate these computations for the user, an open-source interface is developed between software packages \texttt{Maple}, \texttt{Bertini}, and \texttt{Paramotopy}. The effectiveness of the presented approach is demonstrated based on numerical examples and compared with existing indices evaluating the singularity closeness.
Machine Learned Calabi-Yau Metrics and Curvature
Berglund, Per, Butbaia, Giorgi, Hübsch, Tristan, Jejjala, Vishnu, Peña, Damián Mayorga, Mishra, Challenger, Tan, Justin
Finding Ricci-flat (Calabi-Yau) metrics is a long standing problem in geometry with deep implications for string theory and phenomenology. A new attack on this problem uses neural networks to engineer approximations to the Calabi-Yau metric within a given K\"ahler class. In this paper we investigate numerical Ricci-flat metrics over smooth and singular K3 surfaces and Calabi-Yau threefolds. Using these Ricci-flat metric approximations for the Cefal\'u family of quartic twofolds and the Dwork family of quintic threefolds, we study characteristic forms on these geometries. We observe that the numerical stability of the numerically computed topological characteristic is heavily influenced by the choice of the neural network model, in particular, we briefly discuss a different neural network model, namely Spectral networks, which correctly approximate the topological characteristic of a Calabi-Yau. Using persistent homology, we show that high curvature regions of the manifolds form clusters near the singular points. For our neural network approximations, we observe a Bogomolov--Yau type inequality $3c_2 \geq c_1^2$ and observe an identity when our geometries have isolated $A_1$ type singularities. We sketch a proof that $\chi(X~\smallsetminus~\mathrm{Sing}\,{X}) + 2~|\mathrm{Sing}\,{X}| = 24$ also holds for our numerical approximations.