AITopics | projection oracle

Structured Prediction with Projection Oracles

Neural Information Processing Systems Dec-25-2025, 14:48:04 GMT

We propose in this paper a general framework for deriving loss functions for structured prediction. In our framework, the user chooses a convex set including the output space and provides an oracle for projecting onto that set. Given that oracle, our framework automatically generates a corresponding convex and smooth loss function. As we show, adding a projection as output layer provably makes the loss smaller. We identify the marginal polytope, the output space's convex hull, as the best convex set on which to project. However, because the projection onto the marginal polytope can sometimes be expensive to compute, we allow the use of any convex superset instead, with a potentially cheaper-to-compute projection. Since efficient projection algorithms are available for numerous convex sets, this allows us to construct loss functions for a variety of tasks. On the theoretical side, when combined with calibrated decoding, we prove that our loss functions can be used as a consistent surrogate for a (potentially non-convex) target loss function of interest. We demonstrate our losses on label ranking, ordinal regression and multilabel classification, confirming the improved accuracy enabled by projections.
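As a rough illustration of the construction the abstract describes, the sketch below takes the probability simplex as the convex set and builds a loss from its Euclidean projection P. The formula used, 0.5 * ||y - theta||^2 - 0.5 * ||P(theta) - theta||^2, is the standard Fenchel-Young form for a squared-norm regularizer restricted to the set. It is an assumption here, since the abstract does not state the loss explicitly, and the names `project_simplex` and `projection_loss` are illustrative rather than taken from the paper.

```python
import numpy as np

def project_simplex(theta):
    """Euclidean projection of a vector onto the probability simplex
    (sort-based algorithm)."""
    u = np.sort(theta)[::-1]                 # scores in decreasing order
    cssv = np.cumsum(u) - 1.0                # cumulative sums minus the target mass
    k = np.arange(1, len(theta) + 1)
    rho = k[u - cssv / k > 0][-1]            # size of the support of the projection
    tau = cssv[rho - 1] / rho                # threshold subtracted from every score
    return np.maximum(theta - tau, 0.0)

def projection_loss(theta, y, project=project_simplex):
    """Assumed projection-based loss and its gradient with respect to theta.

    loss = 0.5 * ||y - theta||^2 - 0.5 * ||P(theta) - theta||^2
    grad = P(theta) - y
    """
    p = project(theta)
    loss = 0.5 * np.sum((y - theta) ** 2) - 0.5 * np.sum((p - theta) ** 2)
    return loss, p - y

theta = np.array([1.5, 0.2, -0.3])           # model scores
y = np.array([1.0, 0.0, 0.0])                # one-hot ground truth
loss, grad = projection_loss(theta, y)
```

For any target y inside the set, this loss is at least 0.5 * ||y - P(theta)||^2, the squared error measured after projecting the scores. That is one way to read the abstract's claim that adding a projection as output layer makes the loss smaller. Swapping in a different `project` function changes the convex set without touching the rest of the code.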

loss function, name change, structured prediction, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Oracle-based Uniform Sampling from Convex Bodies

Dang, Thanh, Liang, Jiaming

arXiv.org Machine Learning Oct-6-2025

We propose new Markov chain Monte Carlo algorithms to sample a uniform distribution on a convex body $K$. Our algorithms are based on the Alternating Sampling Framework/proximal sampler, which uses Gibbs sampling on an augmented distribution and assumes access to the so-called restricted Gaussian oracle (RGO). The key contribution of this work is the efficient implementation of RGO for uniform sampling on $K$ via rejection sampling and access to either a projection oracle or a separation oracle on $K$. In both oracle cases, we establish non-asymptotic complexities to obtain unbiased samples where the accuracy is measured in Rényi divergence or $\chi^2$-divergence.
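The alternating structure mentioned in the abstract can be sketched as a two-step Gibbs loop: draw y from a Gaussian centred at the current point x, then draw the next x from a Gaussian centred at y and restricted to K. The code below is a simplified illustration, not the authors' algorithm: it implements the restricted Gaussian step by naive rejection with a membership test, whereas the paper builds its RGO from a projection or separation oracle. The unit ball as K, the step size `eta`, and all function names are assumptions made for the example.

```python
import numpy as np

def in_unit_ball(x):
    """Membership test for the example body K (the unit Euclidean ball)."""
    return float(np.dot(x, x)) <= 1.0

def restricted_gaussian_oracle(y, eta, in_K, rng, max_tries=100_000):
    """Sample from N(y, eta * I) restricted to K by naive rejection."""
    for _ in range(max_tries):
        x = y + np.sqrt(eta) * rng.standard_normal(y.shape)
        if in_K(x):
            return x
    raise RuntimeError("rejection sampling failed; try a smaller eta")

def alternating_sampler(x0, eta, in_K, n_steps, seed=0):
    """Gibbs sampling on the augmented density
    pi(x, y) proportional to 1_K(x) * exp(-||x - y||^2 / (2 * eta))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        # Step 1: y given x is an unrestricted Gaussian around x.
        y = x + np.sqrt(eta) * rng.standard_normal(x.shape)
        # Step 2: x given y is the same Gaussian around y, restricted to K.
        x = restricted_gaussian_oracle(y, eta, in_K, rng)
        samples.append(x)
    return np.array(samples)

samples = alternating_sampler(np.zeros(2), eta=0.05, in_K=in_unit_ball, n_steps=200)
```

The x-marginal of the augmented density is uniform on K, which is why the chain targets the uniform distribution. Naive rejection becomes slow when y lands far outside K, and that cost is what a more careful RGO implementation is meant to control.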

algorithm, oracle, rejection, (16 more...)

arXiv.org Machine Learning

2510.02983

Country: North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Reviews: Structured Prediction with Projection Oracles

Neural Information Processing Systems Jan-24-2025, 23:04:42 GMT

Post-feedback update: Thanks for your update. Your additional explanations and results will help improve the paper, and I definitely think this work is strong and should be accepted. The framework itself is new, and the authors make it very clear how prior work fits into the framework as special cases. At the same time, a good case is made for why this framework is useful to have and how it can be better to use than prior losses. Quality: this paper makes a compelling case for the framework it introduces.

projection oracle, structured prediction, tradeoff

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)

Reviews: Structured Prediction with Projection Oracles

Neural Information Processing Systems Jan-24-2025, 23:04:32 GMT

All reviewers agreed that this paper makes a nice contribution to NeurIPS by providing a novel general framework for generating calibrated surrogate loss functions for structured prediction problems. On the other hand, in discussion, they also stressed that including some baselines (e.g., SSVM/CRF approximation/SPEN) in the experiments and reporting runtimes could make this paper much stronger. The authors should implement their promised changes in the camera-ready version.

projection oracle, structured prediction

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.74)

Approximate FW Algorithm with a novel DMO method over Graph-structured Support Set

Pan, Yijian, Qiang, Hongjiao

arXiv.org Artificial Intelligence Nov-25-2024

In this project, we reviewed a paper that addresses the graph-structured convex optimization (GSCO) problem with the approximate Frank-Wolfe (FW) algorithm. We analyzed and re-implemented the original algorithm and introduced some extensions to it. We then conducted experiments to compare the results and concluded that our backtracking line-search method effectively reduced the number of iterations, while our new DMO method (Top-g+ optimal visiting) did not yield satisfactory improvements.
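For readers unfamiliar with Frank-Wolfe, a minimal sketch of the basic method follows. It does not reproduce the graph-structured setting, the DMO, or the backtracking line search discussed in the abstract: the feasible set is assumed to be the probability simplex, the step size is the classical 2 / (t + 2) schedule, and the function names are illustrative.

```python
import numpy as np

def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex:
    returns the vertex minimizing the inner product with grad."""
    v = np.zeros_like(grad)
    v[np.argmin(grad)] = 1.0
    return v

def frank_wolfe(grad_f, x0, n_iters=2000):
    """Basic Frank-Wolfe with the classical 2 / (t + 2) step size."""
    x = np.asarray(x0, dtype=float)
    for t in range(n_iters):
        v = lmo_simplex(grad_f(x))           # best vertex for the linearized objective
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v    # convex combination stays in the simplex
    return x

# Example: minimize 0.5 * ||x - b||^2 over the simplex; b lies in the simplex,
# so the minimizer is b itself.
b = np.array([0.2, 0.5, 0.3])
x_fw = frank_wolfe(lambda x: x - b, np.array([1.0, 0.0, 0.0]))
```

Each iterate is a convex combination of vertices, so no projection is needed. A line search, such as the backtracking variant the abstract mentions, would replace the fixed `gamma` schedule.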

frank-wolfe method, function value, fw method, (13 more...)

arXiv.org Artificial Intelligence

2411.04389

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.49)
Information Technology > Artificial Intelligence > Machine Learning (0.31)

Structured Prediction with Projection Oracles

Neural Information Processing Systems Oct-10-2024, 08:26:19 GMT

We propose in this paper a general framework for deriving loss functions for structured prediction. In our framework, the user chooses a convex set including the output space and provides an oracle for projecting onto that set. Given that oracle, our framework automatically generates a corresponding convex and smooth loss function. As we show, adding a projection as output layer provably makes the loss smaller. We identify the marginal polytope, the output space's convex hull, as the best convex set on which to project.

loss function, projection oracle, structured prediction, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.65)

Primal Methods for Variational Inequality Problems with Functional Constraints

Zhang, Liang, He, Niao, Muehlebach, Michael

arXiv.org Machine Learning Mar-19-2024

Constrained variational inequality problems are recognized for their broad applications across various fields including machine learning and operations research. First-order methods have emerged as the standard approach for solving these problems due to their simplicity and scalability. However, they typically rely on projection or linear minimization oracles to navigate the feasible set, which becomes computationally expensive in practical scenarios featuring multiple functional constraints. Existing efforts to tackle such functional constrained variational inequality problems have centered on primal-dual algorithms grounded in the Lagrangian function. These algorithms along with their theoretical analysis often require the existence and prior knowledge of the optimal Lagrange multipliers. In this work, we propose a simple primal method, termed Constrained Gradient Method (CGM), for addressing functional constrained variational inequality problems, without necessitating any information on the optimal Lagrange multipliers. We establish a non-asymptotic convergence analysis of the algorithm for variational inequality problems with monotone operators under smooth constraints. Remarkably, our algorithms match the complexity of projection-based methods in terms of operator queries for both monotone and strongly monotone settings, while utilizing significantly cheaper oracles based on quadratic programming. Furthermore, we provide several numerical examples to evaluate the efficacy of our algorithms.

algorithm, constraint, variational inequality problem, (9 more...)

arXiv.org Machine Learning

2403.12859

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Low-rank matrix recovery with non-quadratic loss: projected gradient method and regularity projection oracle

Ding, Lijun, Zhang, Yuqian, Chen, Yudong

arXiv.org Machine Learning Aug-31-2020

Existing results for low-rank matrix recovery largely focus on quadratic loss, which enjoys favorable properties such as restricted strong convexity/smoothness (RSC/RSM) and well conditioning over all low-rank matrices. However, many interesting problems involve non-quadratic losses that do not satisfy such properties; examples include one-bit matrix sensing, one-bit matrix completion, and rank aggregation. For these problems, standard nonconvex approaches such as projected gradient with rank constraint alone (a.k.a. iterative hard thresholding) and the Burer-Monteiro approach may perform badly in practice and have no satisfactory theory guaranteeing global and efficient convergence. In this paper, we show that the critical component in low-rank recovery with non-quadratic loss is a regularity projection oracle, which restricts iterates to low-rank matrices within an appropriate bounded set, over which the loss function is well behaved and satisfies a set of relaxed RSC/RSM conditions. Accordingly, we analyze an (averaged) projected gradient method equipped with such an oracle, and prove that it converges globally and linearly. Our results apply to a wide range of non-quadratic problems including rank aggregation, one-bit matrix sensing/completion, and more broadly generalized linear models with rank constraint.
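The baseline the abstract calls projected gradient with rank constraint alone (iterative hard thresholding) can be sketched as a gradient step followed by a truncated SVD. The code below shows only that baseline. It does not implement the paper's regularity projection oracle, whose bounded set is not specified in the abstract, and the names `project_rank` and `iht_step` are illustrative.

```python
import numpy as np

def project_rank(X, r):
    """Best rank-r approximation of X in Frobenius norm via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def iht_step(X, grad, step, r):
    """One projected-gradient (iterative hard thresholding) step:
    move against the gradient, then project back onto rank-r matrices."""
    return project_rank(X - step * grad, r)

# Example with a fully observed quadratic loss 0.5 * ||X - M||_F^2,
# whose gradient is X - M; a unit step recovers the rank-2 matrix M exactly.
rng = np.random.default_rng(0)
M = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))
X0 = np.zeros_like(M)
X1 = iht_step(X0, X0 - M, step=1.0, r=2)
```

A regularity oracle in the sense of the abstract would add a second constraint on top of the rank truncation, keeping iterates inside a bounded set where the loss is well behaved.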

artificial intelligence, machine learning, matrix completion, (15 more...)

arXiv.org Machine Learning

2008.13777

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Structured Prediction with Projection Oracles

Blondel, Mathieu

Neural Information Processing Systems Mar-19-2020, 01:33:06 GMT

We propose in this paper a general framework for deriving loss functions for structured prediction. In our framework, the user chooses a convex set including the output space and provides an oracle for projecting onto that set. Given that oracle, our framework automatically generates a corresponding convex and smooth loss function. As we show, adding a projection as output layer provably makes the loss smaller. We identify the marginal polytope, the output space's convex hull, as the best convex set on which to project. However, because the projection onto the marginal polytope can sometimes be expensive to compute, we allow the use of any convex superset instead, with a potentially cheaper-to-compute projection.

artificial intelligence, inductive learning, machine learning, (6 more...)

Neural Information Processing Systems

Technology: