Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently

Neural Information Processing Systems

We propose a new framework for formulating optimal transport distances between Markov chains. Previously known formulations studied couplings between the entire joint distribution induced by the chains, and derived solutions via a reduction to dynamic programming (DP) in an appropriately defined Markov decision process. This formulation has, however, not led to particularly efficient algorithms so far, since computing the associated DP operators requires fully solving a static optimal transport problem, and these operators need to be applied numerous times during the overall optimization process. In this work, we develop an alternative perspective by considering couplings between a "flattened" version of the joint distributions that we call discounted occupancy couplings, and show that calculating optimal transport distances in the full space of joint distributions can be equivalently formulated as solving a linear program (LP) in this reduced space. This LP formulation allows us to port several algorithmic ideas from other areas of optimal transport theory.
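
A minimal sketch of the kind of LP this abstract describes, assuming a flow-conservation constraint structure for discounted occupancy couplings reconstructed from the abstract (variable and function names are illustrative, not the paper's):

```python
# Sketch: optimal transport distance between two Markov chains as an LP over
# discounted occupancy couplings mu(x, y). The constraint structure is our
# assumption based on the abstract, not the paper's exact formulation.
import numpy as np
from scipy.optimize import linprog

def occupancy_coupling_distance(P1, rho1, P2, rho2, cost, gamma=0.9):
    """P1[i, j], P2[i, j]: transition probability from state i to j;
    rho1, rho2: initial distributions; cost[x, y]: ground cost."""
    n, m = len(rho1), len(rho2)
    c = cost.reshape(-1)                       # objective over flattened mu
    A = np.zeros((n + m, n * m))
    b = np.concatenate([(1 - gamma) * rho1, (1 - gamma) * rho2])
    for x in range(n):
        for y in range(m):
            k = x * m + y
            # chain-1 rows: sum_y mu(x0, y) - gamma * sum_{x,y} P1[x, x0] mu(x, y)
            #               = (1 - gamma) * rho1[x0]
            A[:n, k] += -gamma * P1[x, :]
            A[x, k] += 1.0
            # analogous rows for chain 2
            A[n:, k] += -gamma * P2[y, :]
            A[n + y, k] += 1.0
    res = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(n, m)
```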


Injective Flows for parametric hypersurfaces

Negri, Marcello Massimo, Aellen, Jonathan, Roth, Volker

arXiv.org Machine Learning

Normalizing Flows (NFs) are powerful and efficient models for density estimation. When modeling densities on manifolds, NFs can be generalized to injective flows but the Jacobian determinant becomes computationally prohibitive. Current approaches either consider bounds on the log-likelihood or rely on some approximations of the Jacobian determinant. In contrast, we propose injective flows for parametric hypersurfaces and show that for such manifolds we can compute the Jacobian determinant exactly and efficiently, with the same cost as NFs. Furthermore, we show that for the subclass of star-like manifolds we can extend the proposed framework to always allow for a Cartesian representation of the density. We showcase the relevance of modeling densities on hypersurfaces in two settings. Firstly, we introduce a novel Objective Bayesian approach to penalized likelihood models by interpreting level-sets of the penalty as star-like manifolds. Secondly, we consider Bayesian mixture models and introduce a general method for variational inference by defining the posterior of mixture weights on the probability simplex.
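
As a loose, hypothetical illustration of why an injective change of variables can be cheap on a hypersurface, the sketch below treats the special case of a hypersurface given as a graph z = g(x), where det(JᵀJ) = 1 + ||∇g(x)||² in closed form; this is our toy example, not the paper's more general parametric construction:

```python
# Toy sketch: density induced on the hypersurface that is the graph of g,
# i.e. the injective map f(x) = (x, g(x)). By the matrix determinant lemma,
# det(J^T J) = 1 + ||grad g(x)||^2, so the log-det term is exact and O(d).
import numpy as np

def log_density_on_graph_surface(x, log_p_x, grad_g):
    """log-density of the pushforward of a base density p_X through f(x) = (x, g(x))."""
    g_grad = grad_g(x)
    log_det = 0.5 * np.log1p(np.dot(g_grad, g_grad))   # 0.5 * log det(J^T J)
    return log_p_x(x) - log_det

# Example: standard normal base density pushed onto the paraboloid z = ||x||^2.
d = 3
log_p_x = lambda x: -0.5 * (np.dot(x, x) + d * np.log(2 * np.pi))
grad_g = lambda x: 2.0 * x
print(log_density_on_graph_surface(np.ones(d), log_p_x, grad_g))
```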


Learning Efficient Random Maximum A-Posteriori Predictors with Non-Decomposable Loss Functions

Hazan, Tamir, Maji, Subhransu, Keshet, Joseph, Jaakkola, Tommi

Neural Information Processing Systems

In this work we develop efficient methods for learning random MAP predictors for structured label problems. In particular, we construct posterior distributions over perturbations that can be adjusted via stochastic gradient methods. We show that any smooth posterior distribution would suffice to define a smooth PAC-Bayesian risk bound suitable for gradient methods. In addition, we relate the posterior distributions to computational properties of the MAP predictors. We suggest multiplicative posteriors to learn super-modular potential functions that accompany specialized MAP predictors such as graph-cuts. We also describe label-augmented posterior models that can use efficient MAP approximations, such as those arising from linear program relaxations.
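
A simplified illustration of a random MAP predictor, assuming independent labels and Gumbel perturbations (the paper treats structured predictors such as graph-cuts; all names and the independence assumption here are ours):

```python
# Toy perturb-and-MAP sketch: sample perturbations with a learnable scale sigma,
# take the MAP labeling, and estimate a non-decomposable (whole-labeling 0/1)
# loss by sampling. Not the paper's structured setting, only a stand-in.
import numpy as np

rng = np.random.default_rng(0)

def random_map_predict(theta, sigma, rng):
    """theta: (n_vars, n_labels) potentials; sigma: perturbation scale."""
    gumbel = rng.gumbel(size=theta.shape)
    return np.argmax(theta + sigma * gumbel, axis=1)      # per-variable MAP

def estimated_risk(theta, sigma, y_true, rng, n_samples=1000):
    losses = []
    for _ in range(n_samples):
        y_hat = random_map_predict(theta, sigma, rng)
        losses.append(float(not np.array_equal(y_hat, y_true)))  # 0/1 loss on the full labeling
    return np.mean(losses)

theta = rng.normal(size=(5, 3))            # toy potentials
y_true = np.argmax(theta, axis=1)          # take the unperturbed MAP as ground truth
print(estimated_risk(theta, sigma=0.5, y_true=y_true, rng=rng))
```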


Coding Time-Varying Signals Using Sparse, Shift-Invariant Representations

Lewicki, Michael S., Sejnowski, Terrence J.

Neural Information Processing Systems

A common way to represent a time series is to divide it into short-duration blocks, each of which is then represented by a set of basis functions. A limitation of this approach, however, is that the temporal alignment of the basis functions with the underlying structure in the time series is arbitrary. We present an algorithm for encoding a time series that does not require blocking the data. The algorithm finds an efficient representation by inferring the best temporal positions for functions in a kernel basis. These can have arbitrary temporal extent and are not constrained to be orthogonal.
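
The paper infers temporal positions and coefficients probabilistically; purely as an illustration of a shift-invariant decomposition that avoids blocking, the sketch below uses a simpler greedy, matching-pursuit-style procedure (all names are illustrative):

```python
# Greedy shift-invariant decomposition: repeatedly pick the kernel and temporal
# shift most correlated with the residual and subtract it. A stand-in for the
# probabilistic inference used in the paper, shown only to illustrate the idea.
import numpy as np

def shift_invariant_pursuit(signal, kernels, n_events=10):
    residual = signal.astype(float).copy()
    events = []                                   # (kernel index, shift, coefficient)
    for _ in range(n_events):
        best = None
        for i, k in enumerate(kernels):
            corr = np.correlate(residual, k, mode="valid")   # correlation at every shift
            tau = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[tau]) > abs(best[2]):
                best = (i, tau, corr[tau])
        i, tau, a = best
        k = kernels[i]
        coef = a / np.dot(k, k)                   # least-squares coefficient at this shift
        residual[tau:tau + len(k)] -= coef * k
        events.append((i, tau, coef))
    return events, residual

# Toy usage: two short kernels, a signal built from shifted copies plus noise.
rng = np.random.default_rng(1)
kernels = [np.hanning(16), np.sin(np.linspace(0, 3 * np.pi, 16))]
signal = np.zeros(200)
signal[30:46] += 2.0 * kernels[0]
signal[120:136] -= 1.5 * kernels[1]
signal += 0.05 * rng.normal(size=200)
events, residual = shift_invariant_pursuit(signal, kernels, n_events=4)
print(events)
```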

