Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials

Corenflos, Adrien

arXiv.org Machine Learning

Square-root Kalman filters propagate state covariances in Cholesky-factor form for numerical stability, and are a natural target for gradient-based parameter learning in state-space models. Their core operation, triangularization of a matrix $M \in \mathbb{R}^{n \times m}$, is computed via a QR decomposition in practice, but naively differentiating through it causes two problems: the semi-orthogonal factor is non-unique when $m > n$, yielding undefined gradients; and the standard Jacobian formula involves inverses, which diverges when $M$ is rank-deficient. Both are resolved by the observation that all filter outputs relevant to learning depend on the input matrix only through the Gramian $MM^\top$, so the composite loss is smooth in $M$ even where the triangularization is not. We derive a closed-form chain rule directly from the differential of this Gramian identity, prove it exact for the Kalman log-marginal likelihood and filtered moments, and extend it to rank-deficient inputs via a two-component decomposition: a column-space term based on the Moore--Penrose pseudoinverse, and a null-space correction for perturbations outside the column space of $M$.
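As a minimal numerical illustration of the abstract's central observation (a sketch of ours, not the paper's derivation; function names are hypothetical), the triangular factor used by square-root filters can equivalently be obtained from the Gramian $MM^\top$, which is a smooth function of $M$ wherever it is positive definite, whereas the QR route carries a per-column sign ambiguity:

```python
import numpy as np

def triangularize_via_gramian(M):
    """Lower-triangular L with L @ L.T == M @ M.T, via Cholesky of the Gramian."""
    G = M @ M.T
    return np.linalg.cholesky(G)

def triangularize_via_qr(M):
    """Standard square-root-filter route: transpose of R from QR of M^T."""
    _, R = np.linalg.qr(M.T)       # M^T = Q R, hence M M^T = R^T R
    L = R.T
    # QR leaves the sign of each row of R free; fix it so diag(L) > 0,
    # which makes L match the (unique) Cholesky factor.
    return L * np.sign(np.diag(L))

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))    # n = 3 states, m = 5 columns
L1 = triangularize_via_gramian(M)
L2 = triangularize_via_qr(M)
print(np.allclose(L1, L2))         # True: same factor once signs are fixed
```

The sign fix is exactly the non-uniqueness the abstract refers to; the Gramian route sidesteps it by construction.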



Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming

Bart van Merrienboer, Dan Moldovan, Alexander Wiltschko

Neural Information Processing Systems

Many applications in machine learning rely on gradient-based optimization, or at least the efficient calculation of derivatives of models expressed as computer programs. Researchers have a wide variety of tools from which they can choose, particularly if they are using the Python language [21, 16, 24, 2, 1].
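The distinction between source-code transformation and runtime tracing can be made concrete with a hand-written analogue (ours, not Tangent's actual generated output): a source-to-source tool rewrites a function into new Python source that computes the gradient, which can then be read, edited, and debugged like ordinary code.

```python
def f(x):
    y = x * x
    return y

# Hand-written stand-in for the derivative source a source-to-source AD
# tool would emit for f: a plain Python function, inspectable as text.
def df(x):
    y = x * x           # forward pass (kept for clarity)
    dy = 1.0            # seed the output adjoint
    dx = dy * 2.0 * x   # reverse through y = x * x
    return dx

print(df(3.0))  # 6.0
```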


Variational analysis of determinantal varieties

Yang, Yan, Gao, Bin, Yuan, Ya-xiang

arXiv.org Artificial Intelligence

Determinantal varieties -- the sets of bounded-rank matrices or tensors -- have attracted growing interest in low-rank optimization. The tangent cone to low-rank sets is widely studied and underpins a range of geometric methods. The second-order geometry, which encodes curvature information, is more intricate. In this work, we develop a unified framework to derive explicit formulas for both first- and second-order tangent sets to various low-rank sets, including low-rank matrices, tensors, symmetric matrices, and positive semidefinite matrices. The framework also accommodates the intersection of a low-rank set and another set satisfying mild assumptions, thereby yielding a tangent intersection rule. Through the lens of tangent sets, we establish a necessary and sufficient condition under which a nonsmooth problem and its smooth parameterization share equivalent second-order stationary points. Moreover, we exploit tangent sets to characterize optimality conditions for low-rank optimization and prove that verifying second-order optimality is NP-hard. In a separate line of analysis, we investigate variational geometry of the graph of the normal cone to matrix varieties, deriving the explicit Bouligand tangent cone, Fréchet and Mordukhovich normal cones to the graph. These results are further applied to develop optimality conditions for low-rank bilevel programs.
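For context (notation ours; this is the classical first-order result that the abstract's second-order analysis builds on, not the paper's new formulas), the Bouligand tangent cone to the bounded-rank matrix variety admits an explicit description:

```latex
% X has rank s \le r, with thin SVD X = U \Sigma V^{\top}.
T_{\{\mathrm{rank} \le r\}}(X)
  = \bigl\{\, \xi \in \mathbb{R}^{m \times n} :
      \mathrm{rank}\bigl((I - U U^{\top})\, \xi\, (I - V V^{\top})\bigr)
      \le r - s \,\bigr\}
```

That is, a tangent direction may be arbitrary on the column and row spaces of $X$, but its component on their orthogonal complements must itself have low rank; the second-order tangent sets studied in the paper refine this picture with curvature information.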


An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems

Hao, Yuren, Wan, Xiang, Zhai, ChengXiang

arXiv.org Artificial Intelligence

In this paper, we introduce a systematic framework that goes beyond conventional methods to assess LLMs' mathematical-reasoning robustness by stress-testing them on advanced math problems that are mathematically equivalent but vary linguistically and parametrically. These transformations allow us to measure the sensitivity of LLMs to non-mathematical perturbations, thereby enabling a more accurate evaluation of their mathematical-reasoning capabilities. Using this methodology, we created PutnamGAP, a new benchmark dataset with multiple mathematically equivalent variations of competition-level math problems. With the new dataset, we evaluate multiple families of representative LLMs and examine their robustness. Across 18 commercial and open-source models, we observe sharp performance degradation on the variants. OpenAI's flagship reasoning model, O3, scores 51.5% on the originals but drops by 4.7 percentage points on surface-renaming variants and by 12.9 percentage points on parametric variants, while smaller models fare far worse. Overall, the results show that the proposed evaluation methodology is effective for deepening our understanding of the robustness of LLMs and for generating new insights to further improve their mathematical-reasoning capabilities.
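A quick note on the reported numbers: "percentage points" are absolute differences on the accuracy scale, not relative drops, so the implied variant scores are:

```python
base = 51.5                   # O3 accuracy on the original problems
print(round(base - 4.7, 1))   # 46.8 on surface-renaming variants
print(round(base - 12.9, 1))  # 38.6 on parametric variants
```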


Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

Duan, Boyan, Liang, Xiao, Lu, Shuai, Wang, Yaoxiang, Shen, Yelong, Chang, Kai-Wei, Wu, Ying Nian, Yang, Mao, Chen, Weizhu, Gong, Yeyun

arXiv.org Artificial Intelligence

Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.




A geometrical approach to solve the proximity of a point to an axisymmetric quadric in space

Patra, Bibekananda, Kolte, Aditya Mahesh, Bandyopadhyay, Sandipan

arXiv.org Artificial Intelligence

This paper presents the classification of a general quadric into an axisymmetric quadric (AQ) and the solution to the problem of the proximity of a given point to an AQ. The proximity problem in $R^3$ is reduced to the same problem in $R^2$, a reduction not found in the literature. A new method to solve the problem in $R^2$ is used, based on geometrical properties of the conics, such as the sub-normal, the length of the semi-major axis, eccentricity, slope and radius. Furthermore, the problem in $R^2$ is categorised into two sub-cases for the parabola and three for the ellipse/hyperbola, depending on the location of the point, which is a novel approach to the best of the authors' knowledge. The proposed method is suitable for implementation in a common programming language, such as C, and is shown to be faster than a commercial library, namely Bullet.
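As a generic numerical baseline for the $R^2$ sub-problem (this is not the paper's sub-normal-based geometric method; all names here are ours), the foot point on an ellipse can be found by Newton iteration on the standard parametrization $(a\cos t, b\sin t)$:

```python
import numpy as np

def closest_point_on_ellipse(a, b, p, iters=50):
    """Nearest point on x^2/a^2 + y^2/b^2 = 1 to p, via Newton on the angle t."""
    px, py = abs(p[0]), abs(p[1])      # exploit symmetry: work in quadrant I
    t = np.arctan2(a * py, b * px)     # reasonable initial guess
    for _ in range(iters):
        ct, st = np.cos(t), np.sin(t)
        x, y = a * ct, b * st
        # f(t) = d/dt (1/2)|q(t) - p|^2; a root is a stationary foot point
        f = (x - px) * (-a * st) + (y - py) * (b * ct)
        df = (a * st) ** 2 + (b * ct) ** 2 \
             + (x - px) * (-a * ct) + (y - py) * (-b * st)
        t -= f / df
    x, y = a * np.cos(t), b * np.sin(t)
    return np.copysign(x, p[0]), np.copysign(y, p[1])

# Sanity check: for a point on the major axis, the vertex is closest.
print(closest_point_on_ellipse(3.0, 2.0, (10.0, 0.0)))  # (3.0, 0.0)
```

Such iterative schemes are the usual point of comparison for the closed-form, case-split approach the paper advocates.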