AITopics | implicit parameterization

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems | Apr-24-2026, 12:11:10 GMT

Estimating the per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained, but standard deep neural-network function-approximation methods are often inefficient in this setting. An alternative approach, exemplified by value iteration networks, is to learn transition and reward models of a latent Markov decision process whose value predictions fit the data. This approach has been shown empirically to converge faster to a more robust solution in many cases, but there has been little theoretical study of this phenomenon. In this paper, we explore such implicit representations of value functions via theory and focused experimentation. We prove that, for a linear parametrization, gradient descent converges to global optima despite nonlinearity and non-convexity introduced by the implicit representation. Furthermore, we derive convergence rates for both cases which allow us to identify conditions under which stochastic gradient descent (SGD) with this implicit representation converges substantially faster than its explicit counterpart. Finally, we provide empirical results in some simple domains that illustrate the theoretical findings.
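As a rough illustration of the contrast the abstract draws, the sketch below compares an explicit linear value function with an implicit one obtained by solving a small latent Markov decision process. The specific form used here (values computed as the fixed point of the latent Bellman equation and read out through a feature matrix) is an assumption made for illustration and is not stated in the abstract. The names `explicit_values`, `implicit_values`, `features`, `P`, `r`, and `gamma` are hypothetical.

```python
import numpy as np

def explicit_values(features, theta):
    """Explicit linear parameterization: V = features @ theta."""
    return features @ theta

def implicit_values(features, P, r, gamma):
    """Implicit parameterization (illustrative assumption).

    The learnable quantities are a latent transition matrix P and a latent
    reward vector r. The latent values solve the Bellman equation
    v = r + gamma * P @ v, i.e. (I - gamma * P) v = r, and are then read
    out through the feature matrix.
    """
    k = P.shape[0]
    v_latent = np.linalg.solve(np.eye(k) - gamma * P, r)
    return features @ v_latent

# Toy latent chain: state 0 -> state 1 -> state 2, where state 2 is
# absorbing and pays reward 1 on every step.
features = np.eye(3)  # one-hot features, so the latent values are read out directly
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
r = np.array([0.0, 0.0, 1.0])
gamma = 0.5

v_implicit = implicit_values(features, P, r, gamma)
print(v_implicit)
```

In the explicit case the values are linear in `theta`. In the implicit case they depend on `P` through a matrix inverse, which is one way to see where the nonlinearity and non-convexity mentioned in the abstract could come from.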

machine learning, parameterization, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems | Feb-7-2026, 08:15:58 GMT

While known to be sample efficient, these methods have failed to fully leverage recent advances in deep learning, forcing the use of less efficient but more scalable model-free methods which try to learn the values directly.

machine learning, parameterization, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems | Dec-23-2025, 17:17:14 GMT

Estimating the per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained, but standard deep neural-network function-approximation methods are often inefficient in this setting. An alternative approach, exemplified by value iteration networks, is to learn transition and reward models of a latent Markov decision process whose value predictions fit the data. This approach has been shown empirically to converge faster to a more robust solution in many cases, but there has been little theoretical study of this phenomenon. In this paper, we explore such implicit representations of value functions via theory and focused experimentation. We prove that, for a linear parametrization, gradient descent converges to global optima despite non-linearity and non-convexity introduced by the implicit representation. Furthermore, we derive convergence rates for both cases which allow us to identify conditions under which stochastic gradient descent (SGD) with this implicit representation converges substantially faster than its explicit counterpart. Finally, we provide empirical results in some simple domains that illustrate the theoretical findings.

end-to-end model-based reinforcement learning method, implicit parameterization, name change, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.81)

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems | May-26-2025, 14:58:07 GMT

Estimating the per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained, but standard deep neural-network function-approximation methods are often inefficient in this setting. An alternative approach, exemplified by value iteration networks, is to learn transition and reward models of a latent Markov decision process whose value predictions fit the data. This approach has been shown empirically to converge faster to a more robust solution in many cases, but there has been little theoretical study of this phenomenon. In this paper, we explore such implicit representations of value functions via theory and focused experimentation. We prove that, for a linear parametrization, gradient descent converges to global optima despite non-linearity and non-convexity introduced by the implicit representation.

artificial intelligence, end-to-end model-based reinforcement learning method, machine learning, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems | Oct-9-2024, 09:51:57 GMT

Estimating the per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained, but standard deep neural-network function-approximation methods are often inefficient in this setting. An alternative approach, exemplified by value iteration networks, is to learn transition and reward models of a latent Markov decision process whose value predictions fit the data. This approach has been shown empirically to converge faster to a more robust solution in many cases, but there has been little theoretical study of this phenomenon. In this paper, we explore such implicit representations of value functions via theory and focused experimentation. We prove that, for a linear parametrization, gradient descent converges to global optima despite non-linearity and non-convexity introduced by the implicit representation.

end-to-end model-based reinforcement learning method, implicit parameterization, implicit representation, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Dissipative residual layers for unsupervised implicit parameterization of data manifolds

Reshniak, Viktor

arXiv.org Artificial Intelligence | Oct-13-2022

We propose an unsupervised technique for implicit parameterization of data manifolds. In our approach, the data is assumed to belong to a lower dimensional manifold in a higher dimensional space, and the data points are viewed as the endpoints of the trajectories originating outside the manifold. Under this assumption, the data manifold is an attractive manifold of a dynamical system to be estimated. We parameterize such a dynamical system with a residual neural network and propose a spectral localization technique to ensure it is locally attractive in the vicinity of data. We also present initialization and additional regularization of the proposed residual layers. We mention the importance of the considered problem for the tasks of reinforcement learning and support our discussion with examples demonstrating the performance of the proposed layers in denoising and generative tasks.

artificial intelligence, machine learning, manifold, (17 more...)

arXiv.org Artificial Intelligence

2210.071

Country:

North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
