Convexity
Horospherical Depth and Busemann Median on Hadamard Manifolds
Yangdi Jiang, Xiaotian Chang, Cyrus Mostajeran
We introduce the horospherical depth, an intrinsic notion of statistical depth on Hadamard manifolds, and define the Busemann median as the set of its maximizers. The construction exploits the fact that the linear functionals appearing in Tukey's half-space depth are themselves limits of renormalized distance functions; on a Hadamard manifold the same limiting procedure produces Busemann functions, whose sublevel sets are horoballs, the intrinsic replacements for half-spaces. The resulting depth is parametrized by the visual boundary, is isometry-equivariant, and requires neither tangent-space linearization nor a chosen base point. For arbitrary Hadamard manifolds, we prove that the depth regions are nested and geodesically convex, that a centerpoint of depth at least $1/(d+1)$ exists, and hence that the Busemann median exists for every Borel probability measure. Under strictly negative sectional curvature and mild regularity assumptions, the depth is strictly quasi-concave and the median is unique. We also establish robustness: the depth is stable under total-variation perturbations, and under contamination escaping to infinity the limiting median depends on the escape direction but not on how far the contaminating mass has moved along the geodesic ray, in contrast with the Fréchet mean. Finally, we establish uniform consistency of the sample depth and convergence of sample depth regions and sample Busemann medians; on symmetric spaces of noncompact type, the argument proceeds through a VC analysis of upper horospherical half-spaces, while on general Hadamard manifolds it follows from a compactness argument under a mild non-atomicity assumption.
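For concreteness, the limiting procedure this abstract alludes to is the classical Busemann construction. The display below records it, together with the horoball sublevel sets and a Tukey-style depth; the symbols $b_\gamma$, $\mathcal{H}_{\gamma,s}$, and $\mathrm{hd}_\mu$ are our illustrative notation, and the paper's exact definition and normalization may differ.

```latex
% Busemann function of a unit-speed geodesic ray \gamma in a Hadamard manifold M:
% the renormalized distances d(., \gamma(t)) - t converge as t -> infinity.
\[
  b_\gamma(x) \;=\; \lim_{t \to \infty} \bigl( d(x, \gamma(t)) - t \bigr), \qquad x \in M.
\]
% Its sublevel sets are horoballs, the intrinsic substitutes for half-spaces:
\[
  \mathcal{H}_{\gamma, s} \;=\; \{\, x \in M : b_\gamma(x) \le s \,\}.
\]
% A Tukey-style depth then takes an infimum over all rays (illustrative form):
\[
  \mathrm{hd}_\mu(x) \;=\; \inf_{\gamma} \, \mu\bigl(\{\, y \in M : b_\gamma(y) \le b_\gamma(x) \,\}\bigr).
\]
```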
A short proof of near-linear convergence of adaptive gradient descent under fourth-order growth and convexity
Damek Davis, Dmitriy Drusvyatskiy
Davis, Drusvyatskiy, and Jiang showed that gradient descent with an adaptive stepsize converges locally at a nearly linear rate for smooth functions that grow at least quartically away from their minimizers. The argument is intricate, relying on monitoring the performance of the algorithm relative to a certain manifold of slow growth, called the ravine. In this work, we provide a direct Lyapunov-based argument that bypasses these difficulties when the objective is, in addition, convex and has a unique minimizer. As a byproduct of the argument, we obtain a variant that is more adaptive than the original algorithm, with encouraging numerical performance.
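As a concrete reference point, here is a minimal sketch of gradient descent with one classical adaptive stepsize, the Polyak rule $\eta_k = (f(x_k) - f^\star)/\|\nabla f(x_k)\|^2$, run on a quartically growing objective. This is an assumption-laden illustration: the Davis-Drusvyatskiy-Jiang scheme discussed in the abstract uses its own adaptation mechanism, which is not reproduced here.

```python
import numpy as np

def polyak_adaptive_gd(grad_f, f, x0, f_star=0.0, iters=200):
    """Gradient descent with a Polyak-type adaptive stepsize.

    Illustrative only: eta_k = (f(x_k) - f_star) / ||grad f(x_k)||^2 is one
    classical adaptive rule; the paper's algorithm uses its own adaptation.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_f(x)
        gnorm2 = float(g @ g)
        if gnorm2 == 0.0:          # stationary point reached
            break
        eta = (f(x) - f_star) / gnorm2
        x = x - eta * g
    return x

# Example: f(x) = ||x||^4 grows quartically away from its minimizer at 0.
f = lambda x: float(x @ x) ** 2
grad_f = lambda x: 4.0 * float(x @ x) * x
x = polyak_adaptive_gd(grad_f, f, x0=np.ones(3))
```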
Random Coordinate Descent on the Wasserstein Space of Probability Measures
Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal-Gradient (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.
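To make the coordinate-wise idea tangible, here is a minimal particle-based sketch under the simplifying assumption that the objective is a pure potential energy $F(\mu) = \int V \, d\mu$, so that a Wasserstein gradient step acts on each particle through $\nabla V$ and a coordinate step updates a single random coordinate of every particle. The function names are ours; this is an illustrative reading of RWCD, not the paper's algorithm verbatim.

```python
import numpy as np

def random_wasserstein_cd(particles, grad_V, step=0.1, iters=1000, rng=None):
    """Randomized coordinate descent on a particle approximation of a measure.

    Assumes F(mu) = int V dmu, so the Wasserstein gradient acts on the particle
    cloud X (n x d) as grad_V per particle; each iteration updates one random
    coordinate j for all particles simultaneously.
    """
    rng = np.random.default_rng(rng)
    X = np.array(particles, dtype=float)   # n particles in R^d
    n, d = X.shape
    for _ in range(iters):
        j = rng.integers(d)                # pick a random coordinate
        G = grad_V(X)                      # (n, d) gradient of V at each particle
        X[:, j] -= step * G[:, j]          # coordinate-wise Wasserstein step
    return X

# Example: ill-conditioned quadratic potential V(x) = 0.5 * x^T diag(a) x.
a = np.array([1.0, 100.0])
grad_V = lambda X: X * a                   # broadcasting applies diag(a)
X0 = np.random.default_rng(0).normal(size=(256, 2))
X = random_wasserstein_cd(X0, grad_V, step=0.01, iters=2000)
```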
Unveiling Hidden Convexity in Deep Learning: a Sparse Signal Processing Perspective
Deep neural networks (DNNs), particularly those using Rectified Linear Unit (ReLU) activation functions, have achieved remarkable success across diverse machine learning tasks, including image recognition, audio processing, and language modeling. Despite this success, the non-convex nature of DNN loss functions complicates optimization and limits theoretical understanding. In this paper, we highlight how recently developed convex equivalences of ReLU NNs and their connections to sparse signal processing models can address the challenges of training and understanding NNs. Recent research has uncovered several hidden convexities in the loss landscapes of certain NN architectures, notably two-layer ReLU networks, as well as deeper and more varied architectures. This paper seeks to provide an accessible and educational overview that bridges recent advances in the mathematics of deep learning with traditional signal processing, encouraging broader signal processing applications.
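One frequently cited equivalence of the kind this overview surveys is the convex reformulation of weight-decay-regularized two-layer ReLU training due to Pilanci and Ergen (2020). We sketch it below for data $X \in \mathbb{R}^{n \times d}$ and labels $y$, with the caveat that this is our illustrative rendering and the paper's exact formulation may differ.

```latex
% Enumerate the finitely many ReLU activation patterns realizable on the data:
% D_i = diag(1[X u >= 0]) for some u, i = 1, ..., P. Regularized two-layer ReLU
% training is then equivalent to the convex group-lasso program
\[
  \min_{\{v_i, w_i\}}\;
  \frac{1}{2}\Bigl\| \sum_{i=1}^{P} D_i X (v_i - w_i) - y \Bigr\|_2^2
  \;+\; \lambda \sum_{i=1}^{P} \bigl( \|v_i\|_2 + \|w_i\|_2 \bigr)
\]
\[
  \text{s.t.}\quad (2 D_i - I)\, X v_i \ge 0,\qquad (2 D_i - I)\, X w_i \ge 0,
  \qquad i = 1, \dots, P,
\]
% whose group-sparse solutions mirror sparse signal recovery, which is the
% bridge to signal processing emphasized in the abstract.
```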
Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets
In this work, we explore online convex optimization (OCO) and introduce a new condition and analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of the loss functions exceeds a certain threshold, the curvature of the feasible set can be exploited by the follow-the-leader (FTL) algorithm to achieve a logarithmic regret. This study reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. In particular, we first prove that if an optimal decision is on the boundary of the feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret bound of $O(\rho \ln T)$ in stochastic environments. Here, $\rho > 0$ is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set. Our approach, unlike existing ones, works directly with convex loss functions, simultaneously exploiting their curvature, and achieves logarithmic regret using only a local property of the feasible set. Additionally, the algorithm achieves an $O(\sqrt{T})$ regret even in adversarial environments, in which FTL suffers an $\Omega(\sqrt{T})$ regret, and achieves an $O(\rho \ln T + \sqrt{C \rho \ln T})$ regret in corrupted stochastic environments with corruption level $C$.
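For orientation, here is a minimal sketch of the FTL baseline mentioned in the abstract, for linear losses $f_t(x) = \langle g_t, x \rangle$ on a Euclidean ball, where the leader has the closed form $x_{t+1} = -\rho\, G_t / \|G_t\|$ for the cumulative gradient $G_t$. This illustrates only the baseline; the paper's algorithm, which additionally adapts to the curvature of the losses, is not reproduced here.

```python
import numpy as np

def ftl_on_ball(gradients, radius=1.0):
    """Follow-the-leader for online linear losses f_t(x) = <g_t, x> on an
    l2 ball: the leader minimizes the cumulative loss, which on a ball is
    the point -radius * G / ||G|| for cumulative gradient G. Returns regret
    against the best fixed decision in hindsight.
    """
    d = len(gradients[0])
    x = np.zeros(d)            # arbitrary first play
    G = np.zeros(d)            # cumulative gradient
    total_loss = 0.0
    for g in gradients:
        total_loss += float(g @ x)
        G += g
        if np.linalg.norm(G) > 0:
            x = -radius * G / np.linalg.norm(G)   # FTL step on the ball
    best = -radius * np.linalg.norm(G)            # best fixed decision's loss
    return total_loss - best

# Stochastic environment: gradients concentrate around a nonzero mean, the
# regime where FTL exploits the ball's curvature for fast rates.
rng = np.random.default_rng(0)
gs = rng.normal(loc=[1.0, 0.0], scale=0.5, size=(1000, 2))
print(ftl_on_ball(list(gs)))
```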