cone
- North America > United States (0.67)
- Europe > France (0.28)
- Asia > Middle East > Republic of Türkiye (0.14)
- (45 more...)
- Law (0.93)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
- Government > Military (0.67)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (0.51)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.42)
- North America > United States > California (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > California (0.04)
- Asia > China > Hong Kong (0.04)
Conic Descent and its Application to Memory-efficient Optimization over Positive Semidefinite Matrices
We present an extension of the conditional gradient method to problems whose feasible sets are convex cones. We provide a convergence analysis for the method and for variants with nonconvex objectives, and we extend the analysis to practical cases with effective line search strategies. For the specific case of the positive semidefinite cone, we present a memory-efficient version based on randomized matrix sketches and advocate a heuristic greedy step that greatly improves its practical performance. Numerical results on phase retrieval and matrix completion problems indicate that our method can offer substantial advantages over traditional conditional gradient and Burer-Monteiro approaches.
Cone-Constrained Principal Component Analysis
Estimating a vector from noisy quadratic observations is a task that arises naturally in many contexts, from dimensionality reduction, to synchronization and phase retrieval problems. It is often the case that additional information is available about the unknown vector (for instance, sparsity, sign or magnitude of its entries). Many authors propose non-convex quadratic optimization problems that aim at exploiting optimally this information. However, solving these problems is typically NP-hard. We consider a simple model for noisy quadratic observation of an unknown vector $\bvz$.
ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs
Query embedding (QE)---which aims to embed entities and first-order logical (FOL) queries in low-dimensional spaces---has shown great power in multi-hop reasoning over knowledge graphs. Recently, embedding entities and queries with geometric shapes becomes a promising direction, as geometric shapes can naturally represent answer sets of queries and logical relationships among them. However, existing geometry-based models have difficulty in modeling queries with negation, which significantly limits their applicability. To address this challenge, we propose a novel query embedding model, namely \textbf{Con}e \textbf{E}mbeddings (ConE), which is the first geometry-based QE model that can handle all the FOL operations, including conjunction, disjunction, and negation. Specifically, ConE represents entities and queries as Cartesian products of two-dimensional cones, where the intersection and union of cones naturally model the conjunction and disjunction operations. By further noticing that the closure of complement of cones remains cones, we design geometric complement operators in the embedding space for the negation operations. Experiments demonstrate that ConE significantly outperforms existing state-of-the-art methods on benchmark datasets.
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance
Recent studies have provided both empirical and theoretical evidence illustrating that heavy tails can emerge in stochastic gradient descent (SGD) in various scenarios. Such heavy tails potentially result in iterates with diverging variance, which hinders the use of conventional convergence analysis techniques that rely on the existence of the second-order moments. In this paper, we provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance, for a class of strongly convex objectives. In the case where the $p$-th moment of the noise exists for some $p\in [1,2)$, we first identify a condition on the Hessian, coined `$p$-positive (semi-)definiteness', that leads to an interesting interpolation between the positive semi-definite cone ($p=2$) and the cone of diagonally dominant matrices with non-negative diagonal entries ($p=1$). Under this condition, we provide a convergence rate for the distance to the global optimum in $L^p$. Furthermore, we provide a generalized central limit theorem, which shows that the properly scaled Polyak-Ruppert averaging converges weakly to a multivariate $\alpha$-stable random vector.Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum without necessitating any modification neither to the loss function nor to the algorithm itself, as typically required in robust statistics.We demonstrate the implications of our resultsover misspecified models, in the presence of heavy-tailed data.
Variational analysis of determinantal varieties
Yang, Yan, Gao, Bin, Yuan, Ya-xiang
Determinantal varieties -- the sets of bounded-rank matrices or tensors -- have attracted growing interest in low-rank optimization. The tangent cone to low-rank sets is widely studied and underpins a range of geometric methods. The second-order geometry, which encodes curvature information, is more intricate. In this work, we develop a unified framework to derive explicit formulas for both first- and second-order tangent sets to various low-rank sets, including low-rank matrices, tensors, symmetric matrices, and positive semidefinite matrices. The framework also accommodates the intersection of a low-rank set and another set satisfying mild assumptions, thereby yielding a tangent intersection rule. Through the lens of tangent sets, we establish a necessary and sufficient condition under which a nonsmooth problem and its smooth parameterization share equivalent second-order stationary points. Moreover, we exploit tangent sets to characterize optimality conditions for low-rank optimization and prove that verifying second-order optimality is NP-hard. In a separate line of analysis, we investigate variational geometry of the graph of the normal cone to matrix varieties, deriving the explicit Bouligand tangent cone, Fréchet and Mordukhovich normal cones to the graph. These results are further applied to develop optimality conditions for low-rank bilevel programs.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
Nonnegative Matrix Factorization through Cone Collapse
Nguyen, Manh, Pimentel-Alarcón, Daniel
Nonnegative matrix factorization (NMF) is a widely used tool for learning parts-based, low-dimensional representations of nonnegative data, with applications in vision, text, and bioinformatics. In clustering applications, orthogonal NMF (ONMF) variants further impose (approximate) orthogonality on the representation matrix so that its rows behave like soft cluster indicators. Existing algorithms, however, are typically derived from optimization viewpoints and do not explicitly exploit the conic geometry induced by NMF: data points lie in a convex cone whose extreme rays encode fundamental directions or "topics". In this work we revisit NMF from this geometric perspective and propose Cone Collapse, an algorithm that starts from the full nonnegative orthant and iteratively shrinks it toward the minimal cone generated by the data. We prove that, under mild assumptions on the data, Cone Collapse terminates in finitely many steps and recovers the minimal generating cone of $\mathbf{X}^\top$ . Building on this basis, we then derive a cone-aware orthogonal NMF model (CC-NMF) by applying uni-orthogonal NMF to the recovered extreme rays. Across 16 benchmark gene-expression, text, and image datasets, CC-NMF consistently matches or outperforms strong NMF baselines-including multiplicative updates, ANLS, projective NMF, ONMF, and sparse NMF-in terms of clustering purity. These results demonstrate that explicitly recovering the data cone can yield both theoretically grounded and empirically strong NMF-based clustering methods.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (4 more...)