tangent
Variational analysis of determinantal varieties
Yang, Yan, Gao, Bin, Yuan, Ya-xiang
Determinantal varieties -- the sets of bounded-rank matrices or tensors -- have attracted growing interest in low-rank optimization. The tangent cone to low-rank sets is widely studied and underpins a range of geometric methods. The second-order geometry, which encodes curvature information, is more intricate. In this work, we develop a unified framework to derive explicit formulas for both first- and second-order tangent sets to various low-rank sets, including low-rank matrices, tensors, symmetric matrices, and positive semidefinite matrices. The framework also accommodates the intersection of a low-rank set and another set satisfying mild assumptions, thereby yielding a tangent intersection rule. Through the lens of tangent sets, we establish a necessary and sufficient condition under which a nonsmooth problem and its smooth parameterization share equivalent second-order stationary points. Moreover, we exploit tangent sets to characterize optimality conditions for low-rank optimization and prove that verifying second-order optimality is NP-hard. In a separate line of analysis, we investigate variational geometry of the graph of the normal cone to matrix varieties, deriving the explicit Bouligand tangent cone, Fréchet and Mordukhovich normal cones to the graph. These results are further applied to develop optimality conditions for low-rank bilevel programs.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
Hao, Yuren, Wan, Xiang, Zhai, ChengXiang
In this paper, we introduce a systematic framework beyond conventional method to assess LLMs' mathematical-reasoning robustness by stress-testing them on advanced math problems that are mathematically equivalent but with linguistic and parametric variation. These transformations allow us to measure the sensitivity of LLMs to non-mathematical perturbations, thereby enabling a more accurate evaluation of their mathematical reasoning capabilities. Using this new evaluation methodology, we created PutnamGAP, a new benchmark dataset with multiple mathematically-equivalent variations of competition-level math problems. With the new dataset, we evaluate multiple families of representative LLMs and examine their robustness. Across 18 commercial and open-source models we observe sharp performance degradation on the variants. OpenAI's flagship reasoning model, O3, scores 51.5% on the originals but drops by 4.7 percentage points on surface-renaming variants, and by 12.9 percentage points on parametric variants, while smaller models fare far worse. Overall, the results show that the proposed new evaluation methodology is effective for deepening our understanding of the robustness of LLMs and generating new insights for further improving their mathematical reasoning capabilities.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > District of Columbia > Washington (0.04)
- (6 more...)
Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions
Duan, Boyan, Liang, Xiao, Lu, Shuai, Wang, Yaoxiang, Shen, Yelong, Chang, Kai-Wei, Wu, Ying Nian, Yang, Mao, Chen, Weizhu, Gong, Yeyun
Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > British Columbia (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- (2 more...)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.86)
- North America > United States (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Singapore (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Information Technology (0.46)
- Government (0.46)
- Banking & Finance (0.45)
A geometrical approach to solve the proximity of a point to an axisymmetric quadric in space
Patra, Bibekananda, Kolte, Aditya Mahesh, Bandyopadhyay, Sandipan
This paper presents the classification of a general quadric into an axisymmetric quadric (AQ) and the solution to the problem of the proximity of a given point to an AQ. The problem of proximity in $R^3$ is reduced to the same in $R^2$, which is not found in the literature. A new method to solve the problem in $R^2$ is used based on the geometrical properties of the conics, such as sub-normal, length of the semi-major axis, eccentricity, slope and radius. Furthermore, the problem in $R^2$ is categorised into two and three more sub-cases for parabola and ellipse/hyperbola, respectively, depending on the location of the point, which is a novel approach as per the authors' knowledge. The proposed method is suitable for implementation in a common programming language, such as C and proved to be faster than a commercial library, namely, Bullet.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Champaign County > Champaign (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- (3 more...)
- Research Report (0.70)
- Instructional Material (0.46)
- Information Technology > Artificial Intelligence > Robots (0.46)
- Information Technology > Software > Programming Languages (0.34)
- Asia > Singapore (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Information Technology (0.46)
- Government (0.46)
- Banking & Finance (0.45)
Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents
Kim, Beomsu, Cha, Byunghee, Ye, Jong Chul
With diffusion and flow matching models achieving state-of-the-art generating performance, the interest of the community now turned to reducing the inference time without sacrificing sample quality. Consistency Models (CMs), which are trained to be consistent on diffusion or probability flow ordinary differential equation (PF-ODE) trajectories, enable one or two-step flow or diffusion sampling. However, CMs typically require prolonged training with large batch sizes to obtain competitive sample quality. In this paper, we examine the training dynamics of CMs near convergence and discover that CM tangents -- CM output update directions -- are quite oscillatory, in the sense that they move parallel to the data manifold, not towards the manifold. To mitigate oscillatory tangents, we propose a new loss function, called the manifold feature distance (MFD), which provides manifold-aligned tangents that point toward the data manifold. Consequently, our method -- dubbed Align Your Tangent (AYT) -- can accelerate CM training by orders of magnitude and even out-perform the learned perceptual image patch similarity metric (LPIPS). Furthermore, we find that our loss enables training with extremely small batch sizes without compromising sample quality. Code: https://github.com/1202kbs/AYT