AITopics | direct optimization

Models with discrete latent variables drive extensive research in machine learning applications, including language classification and generation [42, 11, 34], molecular synthesis [19], or game solving[25].

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

Direct Optimization through \arg \max for Discrete Variational Auto-Encoder

Neural Information Processing SystemsDec-25-2025, 02:03:02 GMT

Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates. In the discrete case, one can perform reparametrization using the Gumbel-Max trick, but the resulting objective relies on an $\arg \max$ operation and is non-differentiable. In contrast to previous works which resort to \emph{softmax}-based relaxations, we propose to optimize it directly by applying the \emph{direct loss minimization} approach. Our proposal extends naturally to structured discrete latent variable models when evaluating the $\arg \max$ operation is tractable. We demonstrate empirically the effectiveness of the direct loss minimization technique in variational autoencoders with both unstructured and structured discrete latent variables.

arg max, direct optimization, discrete variational auto-encoder, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Neural Information Processing SystemsDec-24-2025, 16:21:45 GMT

Direct optimization (McAllester et al., 2010; Song et al., 2016) is an appealing framework that replaces integration with optimization of a random objective for approximating gradients in models with discrete random variables (Lorberbom et al., 2018). A* sampling (Maddison et al., 2014) is a framework for optimizing such random objectives over large spaces. We show how to combine these techniques to yield a reinforcement learning algorithm that approximates a policy gradient by finding trajectories that optimize a random objective. We call the resulting algorithms \emph{direct policy gradient} (DirPG) algorithms. A main benefit of DirPG algorithms is that they allow the insertion of domain knowledge in the form of upper bounds on return-to-go at training time, like is used in heuristic search, while still directly computing a policy gradient. We further analyze their properties, showing there are cases where DirPG has an exponentially larger probability of sampling informative gradients compared to REINFORCE. We also show that there is a built-in variance reduction technique and that a parameter that was previously viewed as a numerical approximation can be interpreted as controlling risk sensitivity. Empirically, we evaluate the effect of key degrees of freedom and show that the algorithm performs well in illustrative domains compared to baselines.

direct optimization, direct policy gradient, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Add feedback

Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate

Neural Information Processing SystemsDec-24-2025, 14:13:18 GMT

In design, fabrication, and control problems, we are often faced with the task of synthesis, in which we must generate an object or configuration that satisfies a set of constraints while maximizing one or more objective functions. The synthesis problem is typically characterized by a physical process in which many different realizations may achieve the goal. This many-to-one map presents challenges to the supervised learning of feed-forward synthesis, as the set of viable designs may have a complex structure. In addition, the non-differentiable nature of many physical simulations prevents efficient direct optimization. We address both of these problems with a two-stage neural network architecture that we may consider to be an autoencoder.

amortized synthesis, constrained configuration, synthesis, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Add feedback

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsOct-2-2025, 17:42:37 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper is aiming to speed up optimization for the of graphical model MAP problem. The authors suggest a coarse to fine approach and clearly motivate their work. The method suggested by the authors combines label pruning for each variable with sequential optimization of the pruned, yet finer problem. As far as i can tell, the work is original and correct.

algorithm, experiment, optimization, (10 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Add feedback

Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder

Guy Lorberbom, Andreea Gane, Tommi Jaakkola, Tamir Hazan

Neural Information Processing SystemsOct-2-2025, 06:26:58 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, opération, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization

Kolb, Chris, Frost, Laetitia, Bischl, Bernd, Rügamer, David

arXiv.org Machine LearningOct-1-2025

Structured sparsity regularization offers a principled way to compact neural networks, but its non-differentiability breaks compatibility with conventional stochastic gradient descent and requires either specialized optimizers or additional post-hoc pruning without formal guarantees. In this work, we propose $D$-Gating, a fully differentiable structured overparameterization that splits each group of weights into a primary weight vector and multiple scalar gating factors. We prove that any local minimum under $D$-Gating is also a local minimum using non-smooth structured $L_{2,2/D}$ penalization, and further show that the $D$-Gating objective converges at least exponentially fast to the $L_{2,2/D}$-regularized loss in the gradient flow limit. Together, our results show that $D$-Gating is theoretically equivalent to solving the original group sparsity problem, yet induces distinct learning dynamics that evolve from a non-sparse regime into sparse optimization. We validate our theory across vision, language, and tabular tasks, where $D$-Gating consistently delivers strong performance-sparsity tradeoffs and outperforms both direct optimization of structured penalties and conventional pruning baselines.

experiment, neural network, sparsity, (16 more...)

arXiv.org Machine Learning

2509.23898

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Massachusetts (0.04)
Europe > Sweden (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Neural Information Processing SystemsMay-27-2025, 12:32:47 GMT

Direct optimization (McAllester et al., 2010; Song et al., 2016) is an appealing framework that replaces integration with optimization of a random objective for approximating gradients in models with discrete random variables (Lorberbom et al., 2018). A* sampling (Maddison et al., 2014) is a framework for optimizing such random objectives over large spaces. We show how to combine these techniques to yield a reinforcement learning algorithm that approximates a policy gradient by finding trajectories that optimize a random objective. We call the resulting algorithms \emph{direct policy gradient} (DirPG) algorithms. A main benefit of DirPG algorithms is that they allow the insertion of domain knowledge in the form of upper bounds on return-to-go at training time, like is used in heuristic search, while still directly computing a policy gradient.

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)

Add feedback

Review for NeurIPS paper: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Neural Information Processing SystemsFeb-6-2025, 15:03:45 GMT

Additional Feedback: The motivating example could be explained more clearly. How exactly is the heuristic information incorporated into the search for a_dir? If a simulator is available, one typically wouldn't use a model-free algorithm like REINFORCE. A major benefit of REINFORCE is that it can do a Monte Carlo rollout and have an estimate of the direction to improve the policy without needing a simulator or a model of the environment. Once a simulator is added, it changes the structure of the problem such that different solution methods become available (i.e., MCTS).

direct policy gradient, discrete action space, simulator, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.62)

Add feedback

Reviews: Direct Optimization through \arg \max for Discrete Variational Auto-Encoder

Neural Information Processing SystemsJan-22-2025, 00:56:24 GMT

Originality To the best of my knowledge, the insight that the Gumbel-max trick can be combined with direct loss mimization in the VAE setting is novel. Moreover, the work that had to be done to get the ideas to play well together seems significant and original. Quality I have some reservations about the method presented, which is why I've given a slightly negative overall score. However, it seems very plausible that an author response could clarify my concerns and cause me to revise my score upward. Figure 1: -Epsilon and Tau are different variables in quite different settings.

baseline, direct optimization, discrete variational auto-encoder, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Add feedback

Filters

Collaborating Authors

direct optimization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder

Direct Optimization through \arg \max for Discrete Variational Auto-Encoder

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder

Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Review for NeurIPS paper: Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Reviews: Direct Optimization through \arg \max for Discrete Variational Auto-Encoder