AITopics | legendre transform

Deep Legendre Transform

Neural Information Processing Systems, Jun-20-2026, 08:41:07 GMT

We introduce a novel deep learning algorithm for computing convex conjugates of differentiable convex functions, a fundamental operation in convex analysis with various applications in different fields such as optimization, control theory, physics and economics. While traditional numerical methods suffer from the curse of dimensionality and become computationally intractable in high dimensions, more recent neural network-based approaches scale better, but have mostly been studied with the aim of solving optimal transport problems and require the solution of complicated optimization or max-min problems. Using an implicit Fenchel formulation of convex conjugation, our approach facilitates an efficient gradient-based framework for the minimization of approximation errors and, as a byproduct, also provides a posteriori estimates of the approximation accuracy. Numerical experiments demonstrate our method's ability to deliver accurate results across different high-dimensional examples. Moreover, by employing symbolic regression with Kolmogorov-Arnold networks, it is able to obtain the exact convex conjugates of specific convex functions.
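The abstract above relies on the Fenchel identity for a differentiable convex function, f*(∇f(x)) = ⟨x, ∇f(x)⟩ − f(x), which gives the value of the conjugate at the point y = ∇f(x) without solving a maximization problem. The following is a minimal sketch of that identity only, not the paper's algorithm: the test function f(x) = exp(x), its known conjugate f*(y) = y log y − y, and all function names are assumptions chosen for illustration.

```python
import math

def f(x):
    # Illustrative convex function (an assumption, not from the paper).
    return math.exp(x)

def grad_f(x):
    return math.exp(x)

def conjugate_exact(y):
    # Known closed form of the conjugate of exp: f*(y) = y log y - y, y > 0.
    return y * math.log(y) - y

def conjugate_via_fenchel(x):
    """Return (y, f*(y)) at y = f'(x), using f*(f'(x)) = x f'(x) - f(x)."""
    y = grad_f(x)
    return y, x * y - f(x)

def fenchel_residual(x):
    # Gap between the Fenchel-identity value and the closed-form conjugate.
    y, value = conjugate_via_fenchel(x)
    return abs(value - conjugate_exact(y))

max_residual = max(fenchel_residual(x) for x in [-2.0, -0.5, 0.0, 1.0, 3.0])
```

A residual of this kind can be evaluated at sampled points whenever a candidate for f* is available, which is one way to read the abstract's remark about a posteriori accuracy estimates.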

artificial intelligence, legendre transform, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

9230b34134929c69b14dc37990634122-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-10-2026, 19:41:02 GMT

compute, criterion, prop, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.46)

A Additional Proofs, A.1 Proof of Lemma 1 (strongly convex case)

Neural Information Processing Systems, Aug-17-2025, 01:03:41 GMT

The major part of the proof is adapted from Muzellec et al. [2021, Lemma 3.1]. In bold, the highest accuracy after being calibrated with the semi-dual. Quadratic Problem and can be numerically solved with CVXPY for instance. Ax b K, with K a fixed cone to be compiled only once. SSNB The strong convexity parameter l is chosen in { 0 .

additional proof 470, artificial intelligence, lemma 1, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.46)

Neural Implicit Solution Formula for Efficiently Solving Hamilton-Jacobi Equations

Park, Yesom, Osher, Stanley

arXiv.org Artificial Intelligence, Jan-31-2025

This paper presents an implicit solution formula for the Hamilton-Jacobi partial differential equation (HJ PDE). The formula is derived using the method of characteristics and is shown to coincide with the Hopf and Lax formulas in the case where either the Hamiltonian or the initial function is convex. It provides a simple and efficient numerical approach for computing the viscosity solution of HJ PDEs, bypassing the need for the Legendre transform of the Hamiltonian or the initial condition, and the explicit computation of individual characteristic trajectories. A deep learning-based methodology is proposed to learn this implicit solution formula, leveraging the mesh-free nature of deep learning to ensure scalability for high-dimensional problems. Building upon this framework, an algorithm is developed that approximates the characteristic curves piecewise linearly for state-dependent Hamiltonians. Extensive experimental results demonstrate that the proposed method delivers highly accurate solutions, even for nonconvex Hamiltonians, and exhibits remarkable scalability, achieving computational efficiency for problems up to 40 dimensions.
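For context, the classical Lax (Hopf-Lax) formula that the abstract refers to requires the Legendre transform H* of the Hamiltonian: for u_t + H(∇u) = 0 with convex H and initial data g, it reads u(x, t) = min_y { g(y) + t H*((x − y)/t) }. The sketch below evaluates that classical formula by brute-force grid search in one dimension; it is not the paper's implicit formula or its neural method. The choices H(p) = p²/2, g(y) = y²/2, the grid, and the function name `hopf_lax` are assumptions made for illustration. For these choices the exact solution is u(x, t) = x² / (2(1 + t)).

```python
def hopf_lax(x, t, g, h_conj, y_grid):
    """Lax formula u(x, t) = min_y g(y) + t * H*((x - y) / t), by grid search."""
    return min(g(y) + t * h_conj((x - y) / t) for y in y_grid)

# Illustrative problem data (assumptions, not taken from the paper).
g = lambda y: 0.5 * y * y        # initial condition u(x, 0) = x^2 / 2
h_conj = lambda q: 0.5 * q * q   # Legendre transform of H(p) = p^2 / 2

# Uniform grid on [-5, 5] with spacing 0.001.
y_grid = [-5.0 + 0.001 * i for i in range(10001)]

x, t = 1.0, 1.0
u_numeric = hopf_lax(x, t, g, h_conj, y_grid)
u_exact = x * x / (2.0 * (1.0 + t))   # closed-form solution for this example
```

The grid search grows exponentially with dimension and needs H* in closed form, which are the two costs the abstract says the implicit formula is designed to avoid.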

artificial intelligence, formula, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.19351

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Indiana (0.04)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A note on continuous-time online learning

Ying, Lexing

arXiv.org Machine Learning, May-16-2024

In online learning, the data is provided in sequential order, and the goal of the learner is to make online decisions that minimize overall regret. This note is concerned with continuous-time models and algorithms for several online learning problems: online linear optimization, adversarial bandits, and adversarial linear bandits. For each problem, we extend the discrete-time algorithm to the continuous-time setting and provide a concise proof of the optimal regret bound.

algorithm, bandit, legendre transform, (13 more...)

arXiv.org Machine Learning

2405.10399

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.87)

A Simple and General Duality Proof for Wasserstein Distributionally Robust Optimization

Zhang, Luhao, Yang, Jincheng, Gao, Rui

arXiv.org Machine Learning, Dec-31-2023

We present an elementary yet general proof of duality for Wasserstein distributionally robust optimization. The duality holds for any Kantorovich transport cost, measurable loss function, and nominal probability distribution, provided that an interchangeability principle holds, which is equivalent to certain measurability conditions. To illustrate the broader applicability of our approach, we provide a rigorous treatment of duality results in distributionally robust Markov decision processes and distributionally robust multistage stochastic programming. Furthermore, we extend the result to other problems including infinity-Wasserstein distributionally robust optimization, risk-averse optimization, and globalized distributionally robust counterpart.

assumption, distributionally robust optimization, sup, (15 more...)

arXiv.org Machine Learning

2205.00362

Country:

North America > United States > Texas (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A max-affine spline approximation of neural networks using the Legendre transform of a convex-concave representation

Perrett, Adam, Wood, Danny, Brown, Gavin

arXiv.org Artificial Intelligence, Jul-16-2023

This work presents a novel algorithm for transforming a neural network into a spline representation. Unlike previous work that required convex and piecewise-affine network operators to create a max-affine spline alternate form, this work relaxes this constraint. The only constraint is that the function be bounded and possess a well-defined second derivative, although this was shown experimentally to not be strictly necessary. It can also be performed over the whole network rather than on each layer independently. As in previous work, this bridges the gap between neural networks and approximation theory but also enables the visualisation of network feature maps. Mathematical proof and experimental investigation of the technique are performed, with approximation error and feature maps being extracted from a range of architectures, including convolutional neural networks.
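To make the term "max-affine spline" concrete, here is a minimal sketch for the simplest case, a one-dimensional convex function approximated by the maximum of its tangent lines at a set of knots. This covers the convex case only and is not the paper's algorithm, which handles non-convex functions through a convex-concave representation. The function f(x) = x², the knot placement, and the name `max_affine` are assumptions for illustration. Each tangent has slope f'(k) and intercept f(k) − f'(k)·k, which equals −f*(f'(k)); this is where the Legendre transform enters.

```python
def max_affine(f, df, knots):
    """Build x -> max_k [f(k) + f'(k) (x - k)] from tangents at the knots."""
    # One (slope, intercept) pair per knot; intercept = f(k) - f'(k) * k.
    pieces = [(df(k), f(k) - df(k) * k) for k in knots]

    def approx(x):
        return max(a * x + b for a, b in pieces)

    return approx

# Illustrative convex function and knots (assumptions, not from the paper).
f = lambda x: x * x
df = lambda x: 2.0 * x
knots = [-2.0 + 0.5 * i for i in range(9)]   # -2.0, -1.5, ..., 2.0

spline = max_affine(f, df, knots)
gap_at_knot = f(1.0) - spline(1.0)       # tangents touch f at the knots
gap_between = f(0.25) - spline(0.25)     # midway between knots 0.0 and 0.5
```

For a convex function the tangent envelope never exceeds f, so the gap is non-negative everywhere and shrinks as knots are added.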

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2307.09602

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Telecommunications > Networks (0.54)
Information Technology > Networks (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Non-negative matrix and tensor factorisations with a smoothed Wasserstein loss

Zhang, Stephen Y.

arXiv.org Machine Learning, Apr-4-2021

Non-negative matrix and tensor factorisations are a classical tool in machine learning and data science for finding low-dimensional representations of high-dimensional datasets. In applications such as imaging, datasets can often be regarded as distributions in a space with metric structure. In such a setting, a Wasserstein loss function based on optimal transportation theory is a natural choice since it incorporates knowledge about the geometry of the underlying space. We introduce a general mathematical framework for computing non-negative factorisations of matrices and tensors with respect to an optimal transport loss, and derive an efficient method for its solution using a convex dual formulation. We demonstrate the applicability of this approach with several numerical examples.

decomposition, factorisation, matrix, (16 more...)

arXiv.org Machine Learning

2104.01708

Country:

North America > Canada > British Columbia (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Towards a mathematical theory of trajectory inference

Lavenant, Hugo, Zhang, Stephen, Kim, Young-Heon, Schiebinger, Geoffrey

arXiv.org Machine Learning, Feb-18-2021

We devise a theoretical framework and a numerical method to infer trajectories of a stochastic process from snapshots of its temporal marginals. This problem arises in the analysis of single cell RNA-sequencing data, which provide high dimensional measurements of cell states but cannot track the trajectories of the cells over time. We prove that for a class of stochastic processes it is possible to recover the ground truth trajectories from limited samples of the temporal marginals at each time-point, and provide an efficient algorithm to do so in practice. The method we develop, Global Waddington-OT (gWOT), boils down to a smooth convex optimization problem posed globally over all time-points involving entropy-regularized optimal transport. We demonstrate that this problem can be solved efficiently in practice and yields good reconstructions, as we show on several synthetic and real datasets.

optimal transport, proposition 4, theorem 4, (17 more...)

arXiv.org Machine Learning

2102.09204

Country:

North America > United States > North Carolina (0.04)
North America > Canada > British Columbia (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(3 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.45)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Additivity of Information in Multilayer Networks via Additive Gaussian Noise Transforms

Reeves, Galen

arXiv.org Machine Learning, Oct-12-2017

Multilayer (or deep) networks are powerful probabilistic models based on multiple stages of a linear transform followed by a non-linear (possibly random) function. In general, the linear transforms are defined by matrices and the non-linear functions are defined by information channels. These models have gained great popularity due to their ability to characterize complex probabilistic relationships arising in a wide variety of inference problems. The contribution of this paper is a new method for analyzing the fundamental limits of statistical inference in settings where the model is known. The validity of our method can be established in a number of settings and is conjectured to hold more generally. A key assumption made throughout is that the matrices are drawn randomly from orthogonally invariant distributions. Our method yields explicit formulas for 1) the mutual information; 2) the minimum mean-squared error (MMSE); 3) the existence and locations of certain phase-transitions with respect to the problem parameters; and 4) the stationary points for the state evolution of approximate message passing algorithms. When applied to the special case of models with multivariate Gaussian channels our method is rigorous and has close connections to free probability theory for random matrices. When applied to the general case of non-Gaussian channels, our method provides a simple alternative to the replica method from statistical physics. A key observation is that the combined effects of the individual components in the model (namely the matrices and the channels) are additive when viewed in a certain transform domain.

artificial intelligence, matrix, mmse function, (14 more...)

arXiv.org Machine Learning

1710.0458

Country: Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)
