coordinate ascent
Reviews: Asynchronous Parallel Coordinate Minimization for MAP Inference
Summary: This paper proposes an asynchronous parallel approximate algorithm for MAP inference in graphical models represented as factor graphs. The proposed method is based on dual decomposition, which breaks the model into overlapping pieces and adds penalty terms to enforce agreement between the overlapping portions. Whereas HOGWILD! performs asynchronous gradient updates at each factor, the proposed method performs a full coordinate-ascent update at each iteration. The main concern is that updates based on stale values will be invalid; however, the authors present results that bound the expected error of this type. They also provide methods for adaptively choosing the number of worker nodes to further reduce this error.
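To make the gradient-step vs. full-coordinate-step distinction concrete, here is a minimal sketch of lock-free workers that each apply an exact coordinate-maximization step while reading possibly stale values of the other coordinates. The objective is a toy concave quadratic, not the paper's MAP dual, and every name and constant here is illustrative.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for the dual: maximize the concave quadratic
#   g(lam) = b^T lam - 0.5 * lam^T A lam,  with A symmetric positive definite.
# Each worker repeatedly picks a coordinate i and applies an *exact*
# coordinate-ascent step, reading the other coordinates without locks,
# so the values it sees may be stale -- the regime the paper analyzes.
n = 8
rng = np.random.default_rng(0)
Mat = rng.standard_normal((n, n))
A = Mat @ Mat.T + n * np.eye(n)       # well-conditioned SPD matrix
b = rng.standard_normal(n)
lam = np.zeros(n)                     # shared iterate, updated lock-free

def worker(seed, num_updates=500):
    local_rng = np.random.default_rng(seed)
    for _ in range(num_updates):
        i = local_rng.integers(n)
        snap = lam.copy()             # possibly stale snapshot of the others
        # exact argmax of g over coordinate i, the rest fixed at the snapshot
        lam[i] = (b[i] - A[i] @ snap + A[i, i] * snap[i]) / A[i, i]

with ThreadPoolExecutor(max_workers=4) as pool:
    for s in range(4):
        pool.submit(worker, s)

print("async  solution:", np.round(lam, 4))
print("closed form    :", np.round(np.linalg.solve(A, b), 4))
```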
Solving Multi-Model MDPs by Coordinate Ascent and Dynamic Programming
The multi-model Markov decision process (MMDP) is a promising framework for computing policies that are robust to parameter uncertainty in MDPs. MMDPs aim to find a policy that maximizes the expected return over a distribution of MDP models. Because MMDPs are NP-hard to solve, most methods resort to approximations. In this paper, we derive the policy gradient of MMDPs and propose CADP, which combines a coordinate ascent method and a dynamic programming algorithm for solving MMDPs. The main innovation of CADP over earlier algorithms is its coordinate-ascent perspective: it adjusts model weights iteratively to guarantee monotone policy improvement toward a local maximum. A theoretical analysis of CADP proves that it never performs worse than previous dynamic programming algorithms like WSU. Our numerical results indicate that CADP substantially outperforms existing methods on several benchmark problems.
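For intuition, here is a hedged sketch of the weighted dynamic-programming backup that WSU-style methods, and CADP's inner loop, build on. The tensor shapes, the function name, and the omission of CADP's weight-update step are all assumptions of this illustration, not the paper's implementation.

```python
import numpy as np

# Weighted backward induction over K tabular finite-horizon MDP models.
# P[k][a] is the (S, S) transition matrix of model k under action a,
# R[k][a] its length-S reward vector, and w[k] the weight of model k.
# CADP's contribution -- re-deriving w between sweeps so that the weighted
# return improves monotonically -- is deliberately omitted here.
def weighted_backward_induction(P, R, w, horizon):
    K, A = len(P), len(P[0])
    S = P[0][0].shape[0]
    V = np.zeros((K, S))                     # per-model value at step t + 1
    policy = np.zeros((horizon, S), dtype=int)
    for t in reversed(range(horizon)):
        # per-model Q-values: Q[k, a, s] = reward + expected next value
        Q = np.array([[R[k][a] + P[k][a] @ V[k] for a in range(A)]
                      for k in range(K)])
        weighted_Q = np.einsum('k,kas->as', w, Q)   # mix models by weight
        policy[t] = weighted_Q.argmax(axis=0)       # greedy w.r.t. the mix
        for k in range(K):                          # evaluate chosen actions
            V[k] = Q[k, policy[t], np.arange(S)]
    return policy, w @ V                            # weighted initial values
```

Repeated calls to a backup like this, with the weights w re-derived from the policy gradient in between, would form the coordinate-ascent outer loop the abstract describes.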
Feature selection with optimal coordinate ascent (OCA)
Saltiel, David, Benhamou, Eric
In machine learning, feature selection (FS) is a major component of an efficient algorithm: it fuels the algorithm and is the starting block for prediction. In this paper, we present a new method, called Optimal Coordinate Ascent (OCA), that allows us to select features among block and individual features. OCA relies on coordinate ascent to find an optimal solution for the gradient boosting method's score (the number of correctly classified samples). OCA takes into account dependencies between variables by forming blocks in our optimization. The coordinate-ascent optimization addresses the original NP-hard problem, in which the number of combinations explodes so rapidly that a grid search is infeasible; it considerably reduces the number of iterations, turning this NP-hard problem into a polynomial search. OCA brings substantial differences and improvements compared with the previous coordinate-ascent feature-selection method: we group variables into blocks and individual variables instead of using a binary selection; our initial guess is based on the k best group variables, making our initial point more robust; and we introduce new stopping criteria, making our optimization faster. We compare the two methods on our data set and find that our method outperforms the original one. We also compare our method to the Recursive Feature Elimination (RFE) method and find that OCA leads to the minimal feature set with the highest score. This is a nice byproduct of our method, as it empirically provides the most compact data set with optimal performance.
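A simplified sketch of such a coordinate-ascent loop over feature blocks follows, using scikit-learn's GradientBoostingClassifier as the scored model. The arbitrary block boundaries, the synthetic data, and the omission of the k-best initialization and the authors' stopping criteria are all simplifications of this illustration, not OCA itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
blocks = np.array_split(np.arange(X.shape[1]), 5)   # 5 arbitrary blocks

def score(mask):
    """Cross-validated accuracy using only the blocks flagged in mask."""
    selected = [b for b, m in zip(blocks, mask) if m]
    if not selected:
        return 0.0
    cols = np.concatenate(selected)
    model = GradientBoostingClassifier(random_state=0)
    return cross_val_score(model, X[:, cols], y, cv=3).mean()

mask = [True] * len(blocks)          # start from all blocks selected
best = score(mask)
improved = True
while improved:                      # one coordinate = one block indicator
    improved = False
    for i in range(len(blocks)):
        mask[i] = not mask[i]        # flip block i, keep the rest fixed
        s = score(mask)
        if s > best:
            best, improved = s, True # keep the flip: a coordinate-ascent step
        else:
            mask[i] = not mask[i]    # revert the flip
print("selected blocks:", [i for i, m in enumerate(mask) if m], "score:", best)
```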
Analysis of boosting algorithms using the smooth margin function
Rudin, Cynthia, Schapire, Robert E., Daubechies, Ingrid
We introduce a useful tool for analyzing boosting algorithms called the "smooth margin function," a differentiable approximation of the usual margin for boosting algorithms. We present two boosting algorithms based on this smooth margin, "coordinate ascent boosting" and "approximate coordinate ascent boosting," which are similar to Freund and Schapire's AdaBoost algorithm and Breiman's arc-gv algorithm. We give convergence rates to the maximum margin solution for both of our algorithms and for arc-gv. We then study AdaBoost's convergence properties using the smooth margin function. We precisely bound the margin attained by AdaBoost when the edges of the weak classifiers fall within a specified range. This shows that a previous bound proved by Rätsch and Warmuth is exactly tight. Furthermore, we use the smooth margin to capture explicit properties of AdaBoost in cases where cyclic behavior occurs.
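As a numerical illustration, the snippet below computes the standard smooth-margin form G(lam) = -ln(sum_i exp(-(M lam)_i)) / sum_j lam_j, which lower-bounds the usual minimum margin and approaches it as the weights grow; the matrix and weights are synthetic, and the paper should be consulted for the precise definition it uses.

```python
import numpy as np
from scipy.special import logsumexp

# M[i, j] = y_i * h_j(x_i) collects the weak classifiers' outputs;
# lam holds the (nonnegative) coefficients of the combined classifier.
def smooth_margin(M, lam):
    # logsumexp keeps the computation stable for large ||lam||_1
    return -logsumexp(-(M @ lam)) / lam.sum()

rng = np.random.default_rng(0)
M = rng.choice([-1.0, 1.0], size=(50, 10))   # toy weak-classifier outputs
lam = rng.random(10)
for scale in (1.0, 10.0, 100.0):
    l = scale * lam
    # G stays below the usual (minimum) margin and tends to it from below
    print(f"scale={scale:>5}: smooth={smooth_margin(M, l):+.4f}  "
          f"min margin={(M @ l).min() / l.sum():+.4f}")
```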
On the Dynamics of Boosting
Rudin, Cynthia, Daubechies, Ingrid, Schapire, Robert E.
In order to understand AdaBoost's dynamics, especially its ability to maximize margins, we derive an associated simplified nonlinear iterated map and analyze its behavior in low-dimensional cases. We find stable cycles for these cases, which can be used explicitly to solve for AdaBoost's output. By considering AdaBoost as a dynamical system, we are able to prove Rätsch and Warmuth's conjecture that AdaBoost may fail to converge to a maximal-margin combined classifier when given a 'non-optimal' weak learning algorithm.
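The cyclic behavior is easy to reproduce by direct simulation. The sketch below is plain AdaBoost on a tiny hand-built matrix, not the paper's reduced iterated map: with three examples and three weak classifiers that each misclassify exactly one example, the example-weight vector in this run settles into a stable period-3 cycle instead of converging.

```python
import numpy as np

# M[i, j] = y_i * h_j(x_i): weak classifier j misclassifies only example j.
M = np.where(np.eye(3, dtype=bool), -1.0, 1.0)
d = np.array([0.2, 0.3, 0.5])                # example weights, perturbed start
for t in range(12):
    edges = d @ M                            # edge of each weak classifier
    j = edges.argmax()                       # the optimal weak learner's pick
    alpha = 0.5 * np.log((1 + edges[j]) / (1 - edges[j]))
    d = d * np.exp(-alpha * M[:, j])         # up-weight the mistake, down the rest
    d /= d.sum()
    print(t, j, np.round(d, 4))              # d visits a repeating 3-cycle
```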